linux/kernel
Thomas Gleixner 7f805d17f1 bpf: Prepare hashtab locking for PREEMPT_RT
PREEMPT_RT forbids certain operations like memory allocations (even with
GFP_ATOMIC) from atomic contexts. This is required because even with
GFP_ATOMIC the memory allocator calls into code pathes which acquire locks
with long held lock sections. To ensure the deterministic behaviour these
locks are regular spinlocks, which are converted to 'sleepable' spinlocks
on RT. The only true atomic contexts on an RT kernel are the low level
hardware handling, scheduling, low level interrupt handling, NMIs etc. None
of these contexts should ever do memory allocations.

As regular device interrupt handlers and soft interrupts are forced into
thread context, the existing code which does
  spin_lock*(); alloc(GPF_ATOMIC); spin_unlock*();
just works.

In theory the BPF locks could be converted to regular spinlocks as well,
but the bucket locks and percpu_freelist locks can be taken from arbitrary
contexts (perf, kprobes, tracepoints) which are required to be atomic
contexts even on RT. These mechanisms require preallocated maps, so there
is no need to invoke memory allocations within the lock held sections.

BPF maps which need dynamic allocation are only used from (forced) thread
context on RT and can therefore use regular spinlocks which in turn allows
to invoke memory allocations from the lock held section.

To achieve this make the hash bucket lock a union of a raw and a regular
spinlock and initialize and lock/unlock either the raw spinlock for
preallocated maps or the regular variant for maps which require memory
allocations.

On a non RT kernel this distinction is neither possible nor required.
spinlock maps to raw_spinlock and the extra code and conditional is
optimized out by the compiler. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200224145644.509685912@linutronix.de
2020-02-24 16:20:10 -08:00
..
bpf bpf: Prepare hashtab locking for PREEMPT_RT 2020-02-24 16:20:10 -08:00
cgroup Merge branch 'for-5.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2020-02-10 17:07:05 -08:00
configs
debug Revert "kdb: Get rid of confusing diag msg from "rd" if current task has no regs" 2020-02-06 11:40:09 +00:00
dma dma-direct: improve DMA mask overflow reporting 2020-02-05 18:53:41 +01:00
events perf/bpf: Remove preempt disable around BPF invocation 2020-02-24 16:18:20 -08:00
gcov Revert "um: Enable CONFIG_CONSTRUCTORS" 2020-01-19 22:42:06 +01:00
irq A set of fixes for X86: 2020-02-09 12:11:12 -08:00
livepatch New tracing features: 2019-11-27 11:42:01 -08:00
locking proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
power ACPI: PM: s2idle: Avoid possible race related to the EC GPE 2020-02-11 10:11:02 +01:00
printk printk: fix exclusive_console replaying 2020-01-02 16:15:04 +01:00
rcu rcu: Forgive slow expedited grace periods at boot time 2020-01-25 12:00:40 -08:00
sched Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2020-02-15 12:51:22 -08:00
time y2038: remove unused time32 interfaces 2020-02-21 11:22:15 -08:00
trace bpf/trace: Remove redundant preempt_disable from trace_call_bpf() 2020-02-24 16:18:20 -08:00
.gitignore
acct.c acct: stop using get_seconds() 2019-12-18 18:07:31 +01:00
async.c
audit.c audit: Add __rcu annotation to RCU pointer 2019-12-09 15:19:03 -05:00
audit.h
audit_fsnotify.c
audit_tree.c
audit_watch.c audit_get_nd(): don't unlock parent too early 2019-11-10 11:56:55 -05:00
auditfilter.c
auditsc.c Revert "bpf: Emit audit messages upon successful prog load and unload" 2019-11-23 09:56:02 -08:00
backtracetest.c
bounds.c
capability.c
compat.c y2038: remove unused time32 interfaces 2020-02-21 11:22:15 -08:00
configs.c proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
context_tracking.c context_tracking: Rename context_tracking_is_enabled() => context_tracking_enabled() 2019-10-29 10:01:12 +01:00
cpu.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2020-01-28 10:07:09 -08:00
cpu_pm.c
crash_core.c
crash_dump.c
cred.c Merge branch 'dhowells' (patches from DavidH) 2020-01-14 09:56:31 -08:00
delayacct.c
dma.c
elfcore.c kernel/elfcore.c: include proper prototypes 2019-09-25 17:51:39 -07:00
exec_domain.c
exit.c for-linus-2020-01-03 2020-01-03 11:17:14 -08:00
extable.c bpf: Allow to resolve bpf trampoline and dispatcher in unwind 2020-01-25 07:12:40 -08:00
fail_function.c
fork.c hmm related patches for 5.6 2020-01-29 19:56:50 -08:00
freezer.c Revert "libata, freezer: avoid block device removal while system is frozen" 2019-10-06 09:11:37 -06:00
futex.c futex: Fix kernel-doc notation warning 2020-01-09 13:23:40 +01:00
gen_kheaders.sh kheaders: explain why include/config/autoconf.h is excluded from md5sum 2019-11-11 20:10:01 +09:00
groups.c
hung_task.c
iomem.c
irq_work.c irq_work: Fix IRQ_WORK_BUSY bit clearing 2019-11-15 10:48:37 +01:00
jump_label.c jump_label: Don't warn on __exit jump entries 2019-08-29 15:10:10 +01:00
kallsyms.c Kbuild updates for v5.6 (2nd) 2020-02-09 16:05:50 -08:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks sched/rt, locking: Use CONFIG_PREEMPTION 2019-12-08 14:37:36 +01:00
Kconfig.preempt sched/Kconfig: Fix spelling mistake in user-visible help text 2019-11-12 11:35:32 +01:00
kcov.c kcov: remote coverage support 2019-12-04 19:44:14 -08:00
kexec.c kexec: add machine_kexec_post_load() 2020-01-08 16:32:55 +00:00
kexec_core.c kexec: add machine_kexec_post_load() 2020-01-08 16:32:55 +00:00
kexec_elf.c kexec_elf: support 32 bit ELF files 2019-09-06 23:58:44 +02:00
kexec_file.c kexec: add machine_kexec_post_load() 2020-01-08 16:32:55 +00:00
kexec_internal.h kexec: add machine_kexec_post_load() 2020-01-08 16:32:55 +00:00
kheaders.c
kmod.c
kprobes.c kprobes: Fix optimize_kprobe()/unoptimize_kprobe() cancellation logic 2020-01-09 12:40:13 +01:00
ksysfs.c
kthread.c kthread: make __kthread_queue_delayed_work static 2019-10-16 09:20:58 -07:00
latencytop.c proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
Makefile kcov: ignore fault-inject and stacktrace 2020-01-31 10:30:41 -08:00
module-internal.h
module.c proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
module_signature.c
module_signing.c
notifier.c kernel/notifier.c: remove blocking_notifier_chain_cond_register() 2019-12-04 19:44:12 -08:00
nsproxy.c ns: Introduce Time Namespace 2020-01-14 12:20:48 +01:00
padata.c padata: update documentation 2019-12-11 16:37:02 +08:00
panic.c locking/refcount: Remove unused 'refcount_error_report()' function 2019-11-25 09:15:42 +01:00
params.c lockdown: Lock down module params that specify hardware parameters (eg. ioport) 2019-08-19 21:54:16 -07:00
pid.c pid: Implement pidfd_getfd syscall 2020-01-13 21:49:36 +01:00
pid_namespace.c fork: extend clone3() to support setting a PID 2019-11-15 23:49:22 +01:00
profile.c proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
ptrace.c ptrace: reintroduce usage of subjective credentials in ptrace_has_cap() 2020-01-18 13:51:39 +01:00
range.c
reboot.c
relay.c
resource.c mm/memory_hotplug.c: use PFN_UP / PFN_DOWN in walk_system_ram_range() 2019-09-24 15:54:09 -07:00
rseq.c rseq: Reject unknown flags on rseq unregister 2019-12-25 10:41:20 +01:00
seccomp.c bpf: Use bpf_prog_run_pin_on_cpu() at simple call sites. 2020-02-24 16:20:09 -08:00
signal.c sched.h: Annotate sighand_struct with __rcu 2020-01-26 10:54:47 +01:00
smp.c smp: Remove superfluous cond_func check in smp_call_function_many_cond() 2020-01-28 15:43:00 +01:00
smpboot.c
smpboot.h
softirq.c
stackleak.c
stacktrace.c stacktrace: Get rid of unneeded '!!' pattern 2019-11-11 10:30:59 +01:00
stop_machine.c stop_machine: Make stop_cpus() static 2020-01-17 10:19:21 +01:00
sys.c prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim 2020-01-28 10:09:51 +01:00
sys_ni.c y2038: allow disabling time32 system calls 2019-11-15 14:38:30 +01:00
sysctl-test.c kunit: allow kunit tests to be loaded as a module 2020-01-09 16:42:29 -07:00
sysctl.c rcu: Make PREEMPT_RCU be a modifier to TREE_RCU 2019-12-09 12:37:51 -08:00
sysctl_binary.c sysctl: Remove the sysctl system call 2019-11-26 13:03:56 -06:00
task_work.c
taskstats.c taskstats: fix data-race 2019-12-04 15:18:39 +01:00
test_kprobes.c
torture.c
tracepoint.c
tsacct.c tsacct: add 64-bit btime field 2019-12-18 18:07:31 +01:00
ucount.c
uid16.c
uid16.h
umh.c
up.c smp/up: Make smp_call_function_single() match SMP semantics 2020-02-07 15:34:12 +01:00
user-return-notifier.c
user.c
user_namespace.c
utsname.c
utsname_sysctl.c
watchdog.c watchdog/softlockup: Enforce that timestamp is valid on boot 2020-01-17 11:19:22 +01:00
watchdog_hld.c
workqueue.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2020-01-28 10:07:09 -08:00
workqueue_internal.h