linux/kernel
Mark Rutland cbad0fb2d8 ftrace: Add DYNAMIC_FTRACE_WITH_CALL_OPS
Architectures without dynamic ftrace trampolines incur an overhead when
multiple ftrace_ops are enabled with distinct filters. in these cases,
each call site calls a common trampoline which uses
ftrace_ops_list_func() to iterate over all enabled ftrace functions, and
so incurs an overhead relative to the size of this list (including RCU
protection overhead).

Architectures with dynamic ftrace trampolines avoid this overhead for
call sites which have a single associated ftrace_ops. In these cases,
the dynamic trampoline is customized to branch directly to the relevant
ftrace function, avoiding the list overhead.

On some architectures it's impractical and/or undesirable to implement
dynamic ftrace trampolines. For example, arm64 has limited branch ranges
and cannot always directly branch from a call site to an arbitrary
address (e.g. from a kernel text address to an arbitrary module
address). Calls from modules to core kernel text can be indirected via
PLTs (allocated at module load time) to address this, but the same is
not possible from calls from core kernel text.

Using an indirect branch from a call site to an arbitrary trampoline is
possible, but requires several more instructions in the function
prologue (or immediately before it), and/or comes with far more complex
requirements for patching.

Instead, this patch adds a new option, where an architecture can
associate each call site with a pointer to an ftrace_ops, placed at a
fixed offset from the call site. A shared trampoline can recover this
pointer and call ftrace_ops::func() without needing to go via
ftrace_ops_list_func(), avoiding the associated overhead.

This avoids issues with branch range limitations, and avoids the need to
allocate and manipulate dynamic trampolines, making it far simpler to
implement and maintain, while having similar performance
characteristics.

Note that this allows for dynamic ftrace_ops to be invoked directly from
an architecture's ftrace_caller trampoline, whereas existing code forces
the use of ftrace_ops_get_list_func(), which is in part necessary to
permit the ftrace_ops to be freed once unregistered *and* to avoid
branch/address-generation range limitation on some architectures (e.g.
where ops->func is a module address, and may be outside of the direct
branch range for callsites within the main kernel image).

The CALL_OPS approach avoids this problems and is safe as:

* The existing synchronization in ftrace_shutdown() using
  ftrace_shutdown() using synchronize_rcu_tasks_rude() (and
  synchronize_rcu_tasks()) ensures that no tasks hold a stale reference
  to an ftrace_ops (e.g. in the middle of the ftrace_caller trampoline,
  or while invoking ftrace_ops::func), when that ftrace_ops is
  unregistered.

  Arguably this could also be relied upon for the existing scheme,
  permitting dynamic ftrace_ops to be invoked directly when ops->func is
  in range, but this will require additional logic to handle branch
  range limitations, and is not handled by this patch.

* Each callsite's ftrace_ops pointer literal can hold any valid kernel
  address, and is updated atomically. As an architecture's ftrace_caller
  trampoline will atomically load the ops pointer then dereference
  ops->func, there is no risk of invoking ops->func with a mismatches
  ops pointer, and updates to the ops pointer do not require special
  care.

A subsequent patch will implement architectures support for arm64. There
should be no functional change as a result of this patch alone.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230123134603.1064407-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24 11:49:42 +00:00
..
bpf bpf: Always use maximal size for copy_array() 2022-12-28 14:54:53 -08:00
cgroup MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
configs mm, slob: rename CONFIG_SLOB to CONFIG_SLOB_DEPRECATED 2022-12-01 00:09:20 +01:00
debug kdb: use srcu console list iterator 2022-12-02 11:25:00 +01:00
dma dma-mapping: reject GFP_COMP for noncoherent allocations 2022-12-21 08:45:38 +01:00
entry entry: kmsan: introduce kmsan_unpoison_entry_regs() 2022-10-03 14:03:25 -07:00
events perf/core: Call LSM hook after copying perf_event_attr 2022-12-27 12:44:01 +01:00
futex - Prevent the leaking of a debug timer in futex_waitv() 2023-01-01 11:15:05 -08:00
gcov gcov: add support for checksum field 2022-12-21 14:31:52 -08:00
irq genirq/msi: Return MSI_XA_DOMAIN_SIZE as the maximum MSI index when no domain is present 2022-12-16 14:04:04 +00:00
kcsan hardening updates for v6.2-rc1 2022-12-14 12:20:00 -08:00
livepatch modules changes for v6.2-rc1 2022-12-13 14:05:39 -08:00
locking - Prevent the leaking of a debug timer in futex_waitv() 2023-01-01 11:15:05 -08:00
module powerpc updates for 6.2 2022-12-19 07:13:33 -06:00
power PM: sleep: Refine error message in try_to_freeze_tasks() 2022-12-06 12:04:34 +01:00
printk Merge branch 'rework/console-list-lock' into for-linus 2022-12-08 11:46:56 +01:00
rcu Urgent RCU pull request for v6.2 2022-12-21 07:59:57 -08:00
sched hardening updates for v6.2-rc1 2022-12-14 12:20:00 -08:00
time Random number generator updates for Linux 6.2-rc1. 2022-12-12 16:22:22 -08:00
trace ftrace: Add DYNAMIC_FTRACE_WITH_CALL_OPS 2023-01-24 11:49:42 +00:00
.gitignore
acct.c acct: fix potential integer overflow in encode_comp_t() 2022-11-30 16:13:18 -08:00
async.c
audit.c audit: use time_after to compare time 2022-08-29 19:47:03 -04:00
audit.h audit: remove selinux_audit_rule_update() declaration 2022-09-07 11:30:15 -04:00
audit_fsnotify.c audit: fix potential double free on error path from fsnotify_add_inode_mark 2022-08-22 18:50:06 -04:00
audit_tree.c
audit_watch.c audit_init_parent(): constify path 2022-09-01 17:39:30 -04:00
auditfilter.c
auditsc.c audit: unify audit_filter_{uring(), inode_name(), syscall()} 2022-10-17 14:24:42 -04:00
backtracetest.c
bounds.c mm: multi-gen LRU: minimal implementation 2022-09-26 19:46:09 -07:00
capability.c caps: use type safe idmapping helpers 2022-10-26 10:02:39 +02:00
cfi.c cfi: Switch to -fsanitize=kcfi 2022-09-26 10:13:13 -07:00
compat.c
configs.c
context_tracking.c
cpu.c cpu/hotplug: Do not bail-out in DYING/STARTING sections 2022-12-02 12:43:02 +01:00
cpu_pm.c
crash_core.c vmcoreinfo: warn if we exceed vmcoreinfo data size 2022-11-30 16:13:17 -08:00
crash_dump.c
cred.c cred: Do not default to init_cred in prepare_kernel_cred() 2022-11-01 10:04:52 -07:00
delayacct.c delayacct: support re-entrance detection of thrashing accounting 2022-09-26 19:46:07 -07:00
dma.c
exec_domain.c
exit.c exit: Use READ_ONCE() for all oops/warn limit reads 2022-12-16 12:26:57 -08:00
extable.c
fail_function.c fail_function: fix wrong use of fei_attr_remove() 2022-09-11 21:55:11 -07:00
fork.c New Feature: 2022-12-17 14:06:53 -06:00
freezer.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
gen_kheaders.sh kbuild: build init/built-in.a just once 2022-09-29 04:40:15 +09:00
groups.c security: Add LSM hook to setgroups() syscall 2022-07-15 18:21:49 +00:00
hung_task.c sched: Fix more TASK_state comparisons 2022-09-30 16:50:39 +02:00
iomem.c
irq_work.c
jump_label.c jump_label: Prevent key->enabled int overflow 2022-12-01 15:53:05 -08:00
kallsyms.c kallsyms: Add self-test facility 2022-11-15 00:42:02 -08:00
kallsyms_internal.h kallsyms: Reduce the memory occupied by kallsyms_seqs_of_names[] 2022-11-12 18:47:36 -08:00
kallsyms_selftest.c kallsyms: Remove unneeded semicolon 2022-11-18 12:56:40 -08:00
kallsyms_selftest.h kallsyms: Add self-test facility 2022-11-15 00:42:02 -08:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kcov: kmsan: unpoison area->list in kcov_remote_area_put() 2022-10-03 14:03:23 -07:00
kexec.c panic, kexec: make __crash_kexec() NMI safe 2022-09-11 21:55:06 -07:00
kexec_core.c kexec: remove the unneeded result variable 2022-11-18 13:55:07 -08:00
kexec_elf.c
kexec_file.c kexec: replace crash_mem_range with range 2022-11-18 13:55:07 -08:00
kexec_internal.h panic, kexec: make __crash_kexec() NMI safe 2022-09-11 21:55:06 -07:00
kheaders.c
kmod.c
kprobes.c kprobes: kretprobe events missing on 2-core KVM guest 2022-12-15 08:48:40 +09:00
ksysfs.c kernel/ksysfs.c: export kernel cpu byteorder 2022-11-10 19:07:31 +01:00
kthread.c signal: break out of wait loops on kthread_stop() 2022-10-09 16:01:59 -07:00
latencytop.c latencytop: use the last element of latency_record of system 2022-09-11 21:55:12 -07:00
Makefile kernel hardening fixes for v6.2-rc1 2022-12-23 12:00:24 -08:00
module_signature.c
notifier.c notifier: repair slips in kernel-doc comments 2022-11-30 19:32:30 +01:00
nsproxy.c fs/exec: switch timens when a task gets a new mm 2022-10-25 15:15:52 -07:00
padata.c Kbuild updates for v6.2 2022-12-19 12:33:32 -06:00
panic.c kernel hardening fixes for v6.2-rc1 2022-12-23 12:00:24 -08:00
params.c Driver Core changes for 6.2-rc1 2022-12-16 03:54:54 -08:00
pid.c
pid_namespace.c
profile.c kernel/profile.c: simplify duplicated code in profile_setup() 2022-09-11 21:55:12 -07:00
ptrace.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
range.c
reboot.c kernel/reboot: Add SYS_OFF_MODE_RESTART_PREPARE mode 2022-10-04 15:59:36 +02:00
regset.c
relay.c relay: fix type mismatch when allocating memory in relay_create_buf() 2022-12-11 19:30:19 -08:00
resource.c Driver Core changes for 6.2-rc1 2022-12-16 03:54:54 -08:00
resource_kunit.c
rseq.c rseq: Use pr_warn_once() when deprecated/unknown ABI flags are encountered 2022-11-14 09:58:32 +01:00
scftorture.c
scs.c scs: add support for dynamic shadow call stacks 2022-11-09 18:06:35 +00:00
seccomp.c
signal.c hardening updates for v6.2-rc1 2022-12-14 12:20:00 -08:00
smp.c bitmap patches for v6.1-rc1 2022-10-10 12:49:34 -07:00
smpboot.c smpboot: use atomic_try_cmpxchg in cpu_wait_death and cpu_report_death 2022-09-11 21:55:10 -07:00
smpboot.h
softirq.c
stackleak.c
stacktrace.c
static_call.c
static_call_inline.c static_call: Add call depth tracking support 2022-10-17 16:41:16 +02:00
stop_machine.c
sys.c Random number generator updates for Linux 6.1-rc1. 2022-10-10 10:41:21 -07:00
sys_ni.c kernel/sys_ni: add compat entry for fadvise64_64 2022-08-20 15:17:45 -07:00
sysctl-test.c kernel/sysctl-test: use SYSCTL_{ZERO/ONE_HUNDRED} instead of i_{zero/one_hundred} 2022-09-08 16:56:45 -07:00
sysctl.c MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
task_work.c task_work: use try_cmpxchg in task_work_add, task_work_cancel_match and task_work_run 2022-09-11 21:55:10 -07:00
taskstats.c genetlink: start to validate reserved header bytes 2022-08-29 12:47:15 +01:00
torture.c
tracepoint.c tracepoint: Optimize the critical region of mutex_lock in tracepoint_module_coming() 2022-09-26 13:01:18 -04:00
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
up.c
user-return-notifier.c
user.c kernel/user: Allow user_struct::locked_vm to be usable for iommufd 2022-11-30 20:16:49 -04:00
user_namespace.c ucounts: Split rlimit and ucount values and max values 2022-10-09 16:24:05 -07:00
usermode_driver.c
utsname.c
utsname_sysctl.c kernel/utsname_sysctl.c: Fix hostname polling 2022-10-23 12:01:01 -07:00
watch_queue.c This was a moderately busy cycle for documentation, but nothing all that 2022-08-02 19:24:24 -07:00
watchdog.c powerpc updates for 6.0 2022-08-06 16:38:17 -07:00
watchdog_hld.c
workqueue.c workqueue: Make queue_rcu_work() use call_rcu_hurry() 2022-11-30 13:17:05 -08:00
workqueue_internal.h