linux/kernel
Waiman Long 1c4941fd53 locking/pvqspinlock: Allow limited lock stealing
This patch allows one attempt for the lock waiter to steal the lock
when entering the PV slowpath. To prevent lock starvation, the pending
bit will be set by the queue head vCPU when it is in the active lock
spinning loop to disable any lock stealing attempt.  This helps to
reduce the performance penalty caused by lock waiter preemption while
not having much of the downsides of a real unfair lock.

The pv_wait_head() function was renamed as pv_wait_head_or_lock()
as it was modified to acquire the lock before returning. This is
necessary because of possible lock stealing attempts from other tasks.

Linux kernel builds were run in KVM guest on an 8-socket, 4
cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
Haswell-EX system. Both systems are configured to have 32 physical
CPUs. The kernel build times before and after the patch were:

                    Westmere                    Haswell
  Patch         32 vCPUs    48 vCPUs    32 vCPUs    48 vCPUs
  -----         --------    --------    --------    --------
  Before patch   3m15.6s    10m56.1s     1m44.1s     5m29.1s
  After patch    3m02.3s     5m00.2s     1m43.7s     3m03.5s

For the overcommited case (48 vCPUs), this patch is able to reduce
kernel build time by more than 54% for Westmere and 44% for Haswell.

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Douglas Hatch <doug.hatch@hpe.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Scott J Norton <scott.norton@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1447190336-53317-1-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-04 11:39:51 +01:00
..
bpf bpf, verifier: annotate verbose printer with __printf 2015-11-03 11:29:56 -05:00
configs
debug
events Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-15 09:36:24 -08:00
gcov gcov: add support for GCC 5.1 2015-06-30 19:44:57 -07:00
irq Merge branches 'irq-urgent-for-linus' and 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-15 09:30:48 -08:00
livepatch livepatch: x86: fix relocation computation with kASLR 2015-11-11 17:36:04 +01:00
locking locking/pvqspinlock: Allow limited lock stealing 2015-12-04 11:39:51 +01:00
power mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIM 2015-11-06 17:50:42 -08:00
printk printk: prevent userland from spoofing kernel messages 2015-11-06 17:50:42 -08:00
rcu Merge branches 'doc.2015.10.06a', 'percpu-rwsem.2015.10.06a' and 'torture.2015.10.06a' into HEAD 2015-10-07 16:06:25 -07:00
sched sched/core, locking: Document Program-Order guarantees 2015-12-04 10:33:41 +01:00
time Merge branches 'irq-urgent-for-linus' and 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-15 09:30:48 -08:00
trace This contains three more clean up patches. 2015-11-12 16:22:54 -08:00
.gitignore certs: add .gitignore to stop git nagging about x509_certificate_list 2015-10-21 15:18:35 +01:00
acct.c
async.c
audit.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
audit.h audit: audit_tree_match can be boolean 2015-11-04 08:23:51 -05:00
audit_fsnotify.c audit: clean simple fsnotify implementation 2015-08-06 16:14:53 -04:00
audit_tree.c audit: audit_tree_match can be boolean 2015-11-04 08:23:51 -05:00
audit_watch.c Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-09-08 13:34:59 -07:00
auditfilter.c audit: fix comment block whitespace 2015-11-04 08:23:51 -05:00
auditsc.c Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-09-08 13:34:59 -07:00
backtracetest.c
bounds.c
capability.c
cgroup.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
cgroup_freezer.c cgroup: allow a cgroup subsystem to reject a fork 2015-07-14 17:29:23 -04:00
cgroup_pids.c cgroup: add cgroup_subsys->free() method and use it to fix pids controller 2015-10-15 16:41:53 -04:00
compat.c
configs.c
context_tracking.c context_tracking: avoid irq_save/irq_restore on guest entry and exit 2015-11-10 12:06:23 +01:00
cpu.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-03 18:03:50 -08:00
cpu_pm.c kernel/cpu_pm: fix cpu_cluster_pm_exit comment 2015-09-03 02:42:20 +02:00
cpuset.c Merge branch 'akpm' (patches from Andrew) 2015-11-05 23:10:54 -08:00
crash_dump.c
cred.c kernel/cred.c: remove unnecessary kdebug atomic reads 2015-09-10 13:29:01 -07:00
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-03 18:03:50 -08:00
extable.c kernel/extable.c: remove duplicated include 2015-09-10 13:29:01 -07:00
fork.c Merge branch 'akpm' (patches from Andrew) 2015-11-05 23:10:54 -08:00
freezer.c
futex.c driver core update for 4.4-rc1 2015-11-04 21:50:37 -08:00
futex_compat.c
groups.c
hung_task.c
irq_work.c
jump_label.c locking/static_keys: Add selftest 2015-08-03 11:34:16 +02:00
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec.c kexec: use file name as the output message prefix 2015-11-06 17:50:42 -08:00
kexec_core.c kexec: use file name as the output message prefix 2015-11-06 17:50:42 -08:00
kexec_file.c kexec: use file name as the output message prefix 2015-11-06 17:50:42 -08:00
kexec_internal.h kexec: split kexec_file syscall code to kexec_file.c 2015-09-10 13:29:01 -07:00
kmod.c kmod: don't run async usermode helper as a child of kworker thread 2015-10-23 17:55:10 +09:00
kprobes.c perf/x86/hw_breakpoints: Disallow kernel breakpoints unless kprobe-safe 2015-08-04 10:16:54 +02:00
ksysfs.c kexec: split kexec_load syscall from kexec core code 2015-09-10 13:29:01 -07:00
kthread.c kernel/kthread.c:kthread_create_on_node(): clarify documentation 2015-09-04 16:54:41 -07:00
latencytop.c
Makefile sys_membarrier(): system-wide memory barrier (generic, x86) 2015-09-11 15:21:34 -07:00
membarrier.c sys_membarrier(): system-wide memory barrier (generic, x86) 2015-09-11 15:21:34 -07:00
memremap.c libnvdimm for 4.4: 2015-11-10 12:07:22 -08:00
module-internal.h
module.c module: Fix locking in symbol_put_addr() 2015-08-24 10:37:01 +09:30
module_signing.c KEYS: Merge the type-specific data with the payload data 2015-10-21 15:18:36 +01:00
notifier.c Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-09-01 08:40:25 -07:00
nsproxy.c
padata.c
panic.c kernel/panic.c: turn off locks debug before releasing console lock 2015-11-20 16:17:32 -08:00
params.c Nothing exciting, minor tweaks and cleanups. 2015-11-09 15:53:39 -08:00
pid.c rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN() 2015-07-22 15:27:32 -07:00
pid_namespace.c
profile.c mm: rename alloc_pages_exact_node() to __alloc_pages_node() 2015-09-08 15:35:28 -07:00
ptrace.c seccomp, ptrace: add support for dumping seccomp filters 2015-10-27 19:55:13 -07:00
range.c
reboot.c kexec: split kexec_load syscall from kexec core code 2015-09-10 13:29:01 -07:00
relay.c kernel/relay.c: use kvfree() in relay_free_page_array() 2015-06-30 19:44:59 -07:00
resource.c mm: enhance region_is_ram() to region_intersects() 2015-08-10 23:07:05 -04:00
seccomp.c seccomp, ptrace: add support for dumping seccomp filters 2015-10-27 19:55:13 -07:00
signal.c kernel/signal.c: unexport sigsuspend() 2015-11-20 16:17:32 -08:00
smp.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
smpboot.c stop_machine: Kill smp_hotplug_thread->pre_unpark, introduce stop_machine_unpark() 2015-10-20 10:23:55 +02:00
smpboot.h
softirq.c
stacktrace.c
stop_machine.c sched: Move cpu_active() tests from stop_two_cpus() into migrate_swap_stop() 2015-10-20 10:25:56 +02:00
sys.c pidns: fix set/getpriority and ioprio_set/get in PRIO_USER mode 2015-11-06 17:50:42 -08:00
sys_ni.c mm: mlock: add new mlock system call 2015-11-05 19:34:48 -08:00
sysctl.c kernel/watchdog.c: add sysctl knob hardlockup_panic 2015-11-05 19:34:48 -08:00
sysctl_binary.c
task_work.c task_work: remove fifo ordering guarantee 2015-09-05 13:46:58 -07:00
taskstats.c
test_kprobes.c
torture.c torture: Consolidate cond_resched_rcu_qs() into stutter_wait() 2015-10-06 11:25:01 -07:00
tracepoint.c tracepoint: Give priority to probes of tracepoints 2015-10-25 21:33:54 -04:00
tsacct.c
uid16.c
up.c
user-return-notifier.c
user.c
user_namespace.c capabilities: ambient capabilities 2015-09-04 16:54:41 -07:00
utsname.c
utsname_sysctl.c
watchdog.c kernel/watchdog.c: fix race between proc_watchdog_thresh() and watchdog_timer_fn() 2015-11-05 19:34:48 -08:00
workqueue.c Merge branch 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2015-11-05 14:16:27 -08:00
workqueue_internal.h