linux/kernel/trace
Steven Rostedt (Google) fde59ab161 tracing/filter: Call filter predicate functions directly via a switch statement
Due to retpolines, indirect calls are much more expensive than direct
calls. The filters have a select set of functions it uses for the
predicates. Instead of using function pointers to call them, create a
filter_pred_fn_call() function that uses a switch statement to call the
predicate functions directly. This gives almost a 10% speedup to the
filter logic.

Using the histogram benchmark:

Before:

 # event histogram
 #
 # trigger info: hist:keys=delta:vals=hitcount:sort=delta:size=2048 if delta > 0 [active]
 #

{ delta:        113 } hitcount:        272
{ delta:        114 } hitcount:        840
{ delta:        118 } hitcount:        344
{ delta:        119 } hitcount:      25428
{ delta:        120 } hitcount:     350590
{ delta:        121 } hitcount:    1892484
{ delta:        122 } hitcount:    6205004
{ delta:        123 } hitcount:   11583521
{ delta:        124 } hitcount:   37590979
{ delta:        125 } hitcount:  108308504
{ delta:        126 } hitcount:  131672461
{ delta:        127 } hitcount:   88700598
{ delta:        128 } hitcount:   65939870
{ delta:        129 } hitcount:   45055004
{ delta:        130 } hitcount:   33174464
{ delta:        131 } hitcount:   31813493
{ delta:        132 } hitcount:   29011676
{ delta:        133 } hitcount:   22798782
{ delta:        134 } hitcount:   22072486
{ delta:        135 } hitcount:   17034113
{ delta:        136 } hitcount:    8982490
{ delta:        137 } hitcount:    2865908
{ delta:        138 } hitcount:     980382
{ delta:        139 } hitcount:    1651944
{ delta:        140 } hitcount:    4112073
{ delta:        141 } hitcount:    3963269
{ delta:        142 } hitcount:    1712508
{ delta:        143 } hitcount:     575941

After:

 # event histogram
 #
 # trigger info: hist:keys=delta:vals=hitcount:sort=delta:size=2048 if delta > 0 [active]
 #

{ delta:        103 } hitcount:         60
{ delta:        104 } hitcount:      16966
{ delta:        105 } hitcount:     396625
{ delta:        106 } hitcount:    3223400
{ delta:        107 } hitcount:   12053754
{ delta:        108 } hitcount:   20241711
{ delta:        109 } hitcount:   14850200
{ delta:        110 } hitcount:    4946599
{ delta:        111 } hitcount:    3479315
{ delta:        112 } hitcount:   18698299
{ delta:        113 } hitcount:   62388733
{ delta:        114 } hitcount:   95803834
{ delta:        115 } hitcount:   58278130
{ delta:        116 } hitcount:   15364800
{ delta:        117 } hitcount:    5586866
{ delta:        118 } hitcount:    2346880
{ delta:        119 } hitcount:    1131091
{ delta:        120 } hitcount:     620896
{ delta:        121 } hitcount:     236652
{ delta:        122 } hitcount:     105957
{ delta:        123 } hitcount:     119107
{ delta:        124 } hitcount:      54494
{ delta:        125 } hitcount:      63856
{ delta:        126 } hitcount:      64454
{ delta:        127 } hitcount:      34818
{ delta:        128 } hitcount:      41446
{ delta:        129 } hitcount:      51242
{ delta:        130 } hitcount:      28361
{ delta:        131 } hitcount:      23926

The peak before was 126ns per event, after the peak is 114ns, and the
fastest time went from 113ns to 103ns.

Link: https://lkml.kernel.org/r/20220906225529.781407172@goodmis.org

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-09-26 13:01:10 -04:00
..
rv rv/monitors: add 'static' qualifier for local symbols 2022-09-26 13:01:09 -04:00
blktrace.c blktrace: Fix the blk_fill_rwbs() kernel-doc header 2022-07-15 13:10:04 -06:00
bpf_trace.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-06-23 12:33:24 -07:00
bpf_trace.h
error_report-traces.c
fgraph.c arm64 fixes for 5.19-rc1: 2022-06-03 14:05:34 -07:00
fprobe.c fprobe: Resolve symbols with ftrace_lookup_symbols 2022-05-10 14:42:06 -07:00
ftrace.c ftrace: Fix build warning for ops_references_rec() not used 2022-08-22 09:41:12 -04:00
ftrace_internal.h
Kconfig rv: Add Runtime Verification (RV) interface 2022-07-30 14:01:28 -04:00
kprobe_event_gen_test.c
Makefile rv: Add Runtime Verification (RV) interface 2022-07-30 14:01:28 -04:00
pid_list.c tracing: Cleanup double word in comment 2022-04-26 17:58:50 -04:00
pid_list.h
power-traces.c
preemptirq_delay_test.c
rethook.c rethook: Reject getting a rethook if RCU is not watching 2022-06-17 21:53:35 +02:00
ring_buffer.c ring-buffer: Have 32 bit time stamps use all 64 bits 2022-04-27 17:19:30 -04:00
ring_buffer_benchmark.c
rpm-traces.c
synth_event_gen_test.c
trace.c Tracing updates for 5.20 / 6.0 2022-08-05 09:41:12 -07:00
trace.h tracing: Move struct filter_pred into trace_events_filter.c 2022-09-26 13:01:10 -04:00
trace_benchmark.c tracing: Add numeric delta time to the trace event benchmark 2022-09-26 13:01:09 -04:00
trace_benchmark.h tracing: Add numeric delta time to the trace event benchmark 2022-09-26 13:01:09 -04:00
trace_boot.c tracing: Initialize integer variable to prevent garbage return value 2022-05-26 21:13:00 -04:00
trace_branch.c
trace_clock.c
trace_dynevent.c tracing: Auto generate event name when creating a group of events 2022-07-24 19:11:17 -04:00
trace_dynevent.h
trace_entries.h
trace_eprobe.c tracing/eprobe: Add eprobe filter support 2022-09-26 13:01:08 -04:00
trace_event_perf.c tracing/perf: Fix double put of trace event when init fails 2022-08-21 15:56:07 -04:00
trace_events.c tracing: Have filter accept "common_cpu" to be consistent 2022-08-21 15:56:08 -04:00
trace_events_filter.c tracing/filter: Call filter predicate functions directly via a switch statement 2022-09-26 13:01:10 -04:00
trace_events_filter_test.h
trace_events_hist.c tracing/hist: Call hist functions directly via a switch statement 2022-09-26 13:01:10 -04:00
trace_events_inject.c tracing: Support __rel_loc relative dynamic data location attribute 2021-12-06 15:37:21 -05:00
trace_events_synth.c tracing: Fix strncpy warning in trace_events_synth.c 2022-03-11 11:49:24 -05:00
trace_events_trigger.c tracing: Fix to check event_mutex is held while accessing trigger list 2022-09-06 22:26:00 -04:00
trace_events_user.c tracing/user_events: Fix syntax errors in comments 2022-07-12 17:35:11 -04:00
trace_export.c
trace_functions.c
trace_functions_graph.c
trace_hwlat.c trace/hwlat: make use of the helper function kthread_run_on_cpu() 2022-01-15 16:30:24 +02:00
trace_irqsoff.c
trace_kdb.c
trace_kprobe.c tracing: Auto generate event name when creating a group of events 2022-07-24 19:11:17 -04:00
trace_kprobe_selftest.c
trace_kprobe_selftest.h
trace_mmiotrace.c
trace_nop.c
trace_osnoise.c tracing updates for 5.19: 2022-05-29 10:31:36 -07:00
trace_output.c tracing: Remove usage of list iterator after the loop body 2022-04-27 17:19:30 -04:00
trace_output.h
trace_preemptirq.c tracing: hold caller_addr to hardirq_{enable,disable}_ip 2022-09-06 22:26:00 -04:00
trace_printk.c
trace_probe.c tracing/probes: Have kprobes and uprobes use $COMM too 2022-08-21 15:56:08 -04:00
trace_probe.h tracing/eprobe: Add eprobe filter support 2022-09-26 13:01:08 -04:00
trace_probe_tmpl.h
trace_recursion_record.c tracing: Use trace_create_file() to simplify creation of tracefs entries 2022-05-26 21:12:52 -04:00
trace_sched_switch.c sched/tracing: Append prev_state to tp args instead 2022-05-12 00:37:11 +02:00
trace_sched_wakeup.c sched/tracing: Append prev_state to tp args instead 2022-05-12 00:37:11 +02:00
trace_selftest.c tracing: Reset the function filter after completing trampoline/graph selftest 2022-05-25 16:57:37 -04:00
trace_selftest_dynamic.c
trace_seq.c
trace_stack.c
trace_stat.c
trace_stat.h
trace_synth.h
trace_syscalls.c tracing: Make tp_printk work on syscall tracepoints 2022-04-26 17:58:52 -04:00
trace_uprobe.c Tracing updates for 5.20 / 6.0 2022-08-05 09:41:12 -07:00
tracing_map.c tracing: Fix tracing_map_sort_entries() kernel-doc comment 2022-04-26 17:58:51 -04:00
tracing_map.h