linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-08-05 16:54:27 +00:00

Author	SHA1	Message	Date
Tomas Glozar	a80db1f857	rtla/tests: Test timerlat -P option using actions The -P option is used to set priority of osnoise and timerlat threads. Extend the test for -P with --on-threshold calling a script that looks for running timerlat threads and checks if their priority is set correctly. As --on-threshold is only supported by timerlat at the moment, this is only implemented there so far. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Chang Yin <cyin@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/20250725133817.59237-3-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-07-28 10:22:39 -04:00
Tomas Glozar	892ae5f806	rtla/tests: Add grep checks for base test cases Checking for patterns in rtla output with grep was added to test rtla actions. Add grep checks also for base tests where applicable. Also fix trace event histogram trigger check to use the correct syntax for the command-line option so that the test passes with the grep check. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Chang Yin <cyin@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/20250725133817.59237-2-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-07-28 10:22:38 -04:00
Tomas Glozar	04f837165b	rtla/tests: Limit duration to maximum of 10s Many of the original rtla tests included durations of 1 minute and 30 seconds. Experience has shown this is unnecessary, since 10 seconds as waiting time for samples to appear. Change duration of all rtla tests to at most 10 seconds. This speeds up testing significantly. Before: $ make check All tests successful. Files=3, Tests=54, 536 wallclock secs ( 0.03 usr 0.00 sys + 20.31 cusr 22.02 csys = 42.36 CPU) Result: PASS After: $ make check ... All tests successful. Files=3, Tests=54, 196 wallclock secs ( 0.03 usr 0.01 sys + 20.28 cusr 20.68 csys = 41.00 CPU) Result: PASS Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Chang Yin <cyin@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Cc: Crystal Wood <crwood@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250626123405.1496931-9-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-07-25 16:43:57 -04:00
Tomas Glozar	4e26f84abf	rtla/tests: Add tests for actions Add a bunch of tests covering most of both --on-threshold and --on-end. Parts sensitive to implementation of hist/top are tested for both. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Chang Yin <cyin@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Cc: Crystal Wood <crwood@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250626123405.1496931-8-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-07-25 16:43:57 -04:00
Tomas Glozar	916a9c5b03	rtla/tests: Check rtla output with grep Add argument to the check command in the test suite that takes a regular expression that the output of rtla command is checked against. This allows testing for specific information in rtla output in addition to checking the return value. Two minor improvements are included: running rtla with "eval" so that arguments with spaces can be passed to it via shell quotations, and the stdout of pushd and popd is suppressed to clean up the test output. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Chang Yin <cyin@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Cc: Crystal Wood <crwood@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250626123405.1496931-7-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-07-25 16:43:57 -04:00
Tomas Glozar	3aadb65db5	rtla/timerlat: Add action on end feature Implement actions on end next to actions on threshold. A new option, --on-end is added, parallel to --on-threshold. Instead of being executed whenever a latency threshold is reached, it is executed at the end of the measurement. For example: $ rtla timerlat hist -d 5s --on-end trace will save the trace output at the end. All actions supported by --on-threshold are also supported by --on-end, except for continue, which does nothing with --on-end. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Chang Yin <cyin@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Cc: Crystal Wood <crwood@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250626123405.1496931-6-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-07-25 16:43:57 -04:00
Tomas Glozar	8d933d5c89	rtla/timerlat: Add continue action Introduce option to resume tracing after a latency threshold overflow. The option is implemented as an action named "continue". Example: $ rtla timerlat top -q -T 200 -d 1s --on-threshold \ exec,command="echo Threshold" --on-threshold continue Threshold Threshold Threshold Timer Latency ... The feature is supported for both hist and top. After the continue action is executed, processing of the list of actions is stopped and tracing is resumed. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Chang Yin <cyin@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Cc: Crystal Wood <crwood@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250626123405.1496931-5-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-07-25 16:43:57 -04:00
Tomas Glozar	3b78670e3a	rtla/timerlat_bpf: Allow resuming tracing Currently, rtla-timerlat BPF program uses a global variable stored in a .bss section to store whether tracing has been stopped. Move the information to a separate map, so that it is easily writable from userspace, and add a function that clears the value, resuming tracing after it has been stopped. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Chang Yin <cyin@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Cc: Crystal Wood <crwood@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250626123405.1496931-4-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-07-25 16:43:56 -04:00
Tomas Glozar	6ea082b171	rtla/timerlat: Add action on threshold feature Extend the functionality provided by the -t/--trace option, which triggers saving the contents of a tracefs buffer after tracing is stopped, to support implementing arbitrary actions. A new option, --on-threshold, is added, taking an argument that further specifies the action. Actions added in this patch are: - trace[,file=<filename>]: Saves tracefs buffer, optionally taking a filename. - signal,num=<sig>,pid=<pid>: Sends signal to process. "parent" might be specified instead of number to send signal to parent process. - shell,command=<command>: Execute shell command. Multiple actions may be specified and will be executed in order, including multiple actions of the same type. Trace output requested via -t and -a now adds a trace action to the end of the list. If an action fails, the following actions are not executed. For example, this command: $ rtla timerlat -T 20 --on-threshold trace \ --on-threshold shell,command="grep ipi_send timerlat_trace.txt" \ --on-threshold signal,num=2,pid=parent will send signal 2 (SIGINT) to parent process, but only if saved trace contains the text "ipi_send". This way, the feature can be used for flexible reactions on latency spikes, and allows combining rtla with other tooling like perf. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Chang Yin <cyin@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Cc: Crystal Wood <crwood@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250626123405.1496931-3-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-07-25 16:43:56 -04:00
Tomas Glozar	8b6cbcac76	rtla/timerlat: Introduce enum timerlat_tracing_mode After the introduction of BPF-based sample collection, rtla-timerlat effectively runs in one of three modes: - Pure BPF mode, with tracefs only being used to set up the timerlat tracer. Sample processing and stop on threshold are handled by BPF. - tracefs mode. BPF is unsupported or kernel is lacking the necessary trace event (osnoise:timerlat_sample). Stop on theshold is handled by timerlat tracer stopping tracing in all instances. - BPF/tracefs mixed mode - BPF is used for sample collection for top or histogram, tracefs is used for trace output and/or auto-analysis. Stop on threshold is handled both through BPF program, which stops sample collection for top/histogram and wakes up rtla, and by timerlat tracer, which stops tracing for trace output/auto-analysis instances. Add enum timerlat_tracing_mode, with three values: - TRACING_MODE_BPF - TRACING_MODE_TRACEFS - TRACING_MODE_MIXED Those represent the modes described above. A field of this type is added to struct timerlat_params, named "mode", replacing the no_bpf variable. params->mode is set in timerlat_{top,hist}_parse_args to TRACING_MODE_BPF or TRACING_MODE_MIXED based on whether trace output and/or auto-analysis is requested. timerlat_{top,hist}_main then checks if BPF is not unavailable or disabled, in that case, it sets params->mode to TRACING_MODE_TRACEFS. A condition is added to timerlat_apply_config that skips setting timerlat tracer thresholds if params->mode is TRACING_MODE_BPF (those are unnecessary, since they only turn off tracing, which is already turned off in that case, since BPF is used to collect samples). Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Chang Yin <cyin@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Cc: Crystal Wood <crwood@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250626123405.1496931-2-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-07-25 16:43:56 -04:00
Linus Torvalds	472c5f736b	tracing tools updates for v6.16: - Set distinctive value for failed tests When running "make check" that performs tests on rtla the failure is checked by examining the output. Instead have the tool return an error status if it exceeds the threadhold. - Define __NR_sched_setattr for LoongArch Define __NR_sched_setattr to allow this to build for LoongArch. - Define _GNU_SOURCE for timerlat_bpf.c Due to modifications of struct sched_attr in utils.h when _GNU_SOURCE is not defined, this can cause errors for timerlat_bpf_init() and breakage in BPF sample collection mode. -----BEGIN PGP SIGNATURE----- iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaDeTzBQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qokRAP0XJzos+uvQtkGrqiX5SB/rn1s3/tiD nZagARyiV06BAwEA+NNzqFyx/BLUwMnpx/HFTnIMGXbRVWCVAEeL3t77zgk= =dx7t -----END PGP SIGNATURE----- Merge tag 'trace-tools-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing tools updates from Steven Rostedt: - Set distinctive value for failed tests When running "make check" that performs tests on rtla the failure is checked by examining the output. Instead have the tool return an error status if it exceeds the threadhold. - Define __NR_sched_setattr for LoongArch Define __NR_sched_setattr to allow this to build for LoongArch. - Define _GNU_SOURCE for timerlat_bpf.c Due to modifications of struct sched_attr in utils.h when _GNU_SOURCE is not defined, this can cause errors for timerlat_bpf_init() and breakage in BPF sample collection mode. * tag 'trace-tools-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rtla: Define _GNU_SOURCE in timerlat_bpf.c rtla: Define __NR_sched_setattr for LoongArch rtla: Set distinctive exit value for failed tests	2025-05-29 20:59:52 -07:00
Tomas Glozar	8020361d51	rtla: Define _GNU_SOURCE in timerlat_bpf.c Newer versions of glibc include a definition of struct sched_attr in bits/sched.h (included through sched.h which is included by rtla). Commit `0eecee3406` ("tools/rtla: fix collision with glibc sched_attr/sched_set_attr") has modified the definition of struct sched_attr in utils.h, so that it is only applied with older versions of glibc that do not define it, in order to prevent build failure. The definition in bits/sched.h depends on _GNU_SOURCE. timerlat_bpf.c does not define _GNU_SOURCE, making it fall back to the definition in utils.h. The latter has two fields less, leading to shifted offsets of struct timerlat_params in timerlat_bpf_init. Because of the shift, timerlat_bpf_init incorrectly reads params->entries as 0 for timerlat-hist and disables the creation of histogram maps, causing breakage in BPF sample collection mode: $ rtla timerlat hist -d 1s Error pulling BPF data Fix the issue by also defining _GNU_SOURCE in timerlat_bpf.c. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250430144651.621766-1-tglozar@redhat.com Fixes: `e34293ddce` ("rtla/timerlat: Add BPF skeleton to collect samples") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-05-07 17:25:46 -04:00
Tiezhu Yang	6a38c51a25	rtla: Define __NR_sched_setattr for LoongArch When executing "make -C tools/tracing/rtla" on LoongArch, there exists the following error: src/utils.c:237:24: error: '__NR_sched_setattr' undeclared Just define __NR_sched_setattr for LoongArch if not exist. Link: https://lore.kernel.org/20250422074917.25771-1-yangtiezhu@loongson.cn Reported-by: Haiyong Sun <sunhaiyong@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-05-07 16:35:31 -04:00
Costa Shulyupin	18682166f6	rtla: Set distinctive exit value for failed tests A test is considered failed when a sample trace exceeds the threshold. Failed tests return the same exit code as passed tests, requiring test frameworks to determine the result by searching for "hit stop tracing" in the output. Assign a distinct exit code for failed tests to enable the use of shell expressions and seamless integration with testing frameworks without the need to parse output. Add enum type for return value. Update `make check`. Cc: Daniel Bristot de Oliveira <bristot@kernel.org> Cc: John Kacur <jkacur@redhat.com> Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com> Cc: Eder Zulian <ezulian@redhat.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Jan Stancek <jstancek@redhat.com> Link: https://lore.kernel.org/20250417185757.2194541-1-costa.shul@redhat.com Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-05-07 15:36:19 -04:00
Tomas Glozar	770840a0e7	Documentation/rtla: Include BPF sample collection Add dependencies needed to build rtla with BPF sample collection support to README, and document both ways of sample collection in the manpages. Signed-off-by: Tomas Glozar <tglozar@redhat.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250311114936.148012-5-tglozar@redhat.com	2025-04-14 10:42:55 -06:00
Linus Torvalds	4fa118e5b7	tracing tooling updates for 6.15: - Allow RTLA to collect data via BPF The current implementation of rtla uses libtracefs and libtraceevent to pull sample events generated by the timerlat tracer from the trace buffer. rtla then processes the sample by updating the histogram and summary (current, maximum, minimum, and sum values) as well as checks if tracing has been stopped due to threshold overflow. In use cases where a large number of samples is being generated, that is, with measurements running on many CPUs and with a low interval, this sample processing design causes a significant CPU load on the rtla side. Furthermore, with >100 CPUs and 100us interval, rtla was reported as not being able to keep up with the samples and dropping most of them, leading to it being unusable. Change the way the timerlat trace processes samples by attaching a BPF program to the trace event using the BPF skeleton feature of bpftool. Unlike the current implementation, the BPF implementation does not check whether tracing is stopped (in BPF mode, tracing is always off to improve performance), but waits for a write to a BPF ringbuffer instead. This allows rtla to exit immediately when a threshold is violated, without waiting for the next iteration of the while loop. If the requirements for the BPF implementation are not met, either at build time or at run time, the current implementation is used as fallback. Which implementation is being used can be seen when running rtla timerlat with "-D" option. rtla can be forced to run in non-BPF mode by setting the RTLA_NO_BPF option to 1, for debugging purposes. - Fix LD_FLAGS from being dropped in build - Refactor code to remove duplication of save_trace_to_file - Always set options and do not rely on default settings Do not rely on the default kernel settings of the tracers when starting. They could have been changed by the user which gives inconsistent results. Always set the options that rtla expects. - Add creation of ctags and TAGS for traversing code -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ+WBgRQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qg54AQDCOChaSSBiUkD0VoPKIeDMlPfvO5Qz Xvrst5gtopfKFgEA12/9Lll/sh1eoc4saeGBooNY48HBUMjmX3KNFB14PQg= =aHFu -----END PGP SIGNATURE----- Merge tag 'trace-tools-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing tooling updates from Steven Rostedt: - Allow RTLA to collect data via BPF The current implementation of rtla uses libtracefs and libtraceevent to pull sample events generated by the timerlat tracer from the trace buffer. rtla then processes the sample by updating the histogram and summary (current, maximum, minimum, and sum values) as well as checks if tracing has been stopped due to threshold overflow. In use cases where a large number of samples is being generated, that is, with measurements running on many CPUs and with a low interval, this sample processing design causes a significant CPU load on the rtla side. Furthermore, with >100 CPUs and 100us interval, rtla was reported as not being able to keep up with the samples and dropping most of them, leading to it being unusable. Change the way the timerlat trace processes samples by attaching a BPF program to the trace event using the BPF skeleton feature of bpftool. Unlike the current implementation, the BPF implementation does not check whether tracing is stopped (in BPF mode, tracing is always off to improve performance), but waits for a write to a BPF ringbuffer instead. This allows rtla to exit immediately when a threshold is violated, without waiting for the next iteration of the while loop. If the requirements for the BPF implementation are not met, either at build time or at run time, the current implementation is used as fallback. Which implementation is being used can be seen when running rtla timerlat with "-D" option. rtla can be forced to run in non-BPF mode by setting the RTLA_NO_BPF option to 1, for debugging purposes. - Fix LD_FLAGS from being dropped in build - Refactor code to remove duplication of save_trace_to_file - Always set options and do not rely on default settings Do not rely on the default kernel settings of the tracers when starting. They could have been changed by the user which gives inconsistent results. Always set the options that rtla expects. - Add creation of ctags and TAGS for traversing code * tag 'trace-tools-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rtla: Add the ability to create ctags and etags rtla/tests: Test setting default options rtla/tests: Reset osnoise options before check rtla: Always set all tracer options rtla/osnoise: Set OSNOISE_WORKLOAD to true rtla: Unify apply_config between top and hist rtla/osnoise: Unify params struct rtla: Fix segfault in save_trace_to_file call tools/build: Use SYSTEM_BPFTOOL for system bpftool rtla: Refactor save_trace_to_file tools/rv: Keep user LDFLAGS in build rtla/timerlat: Test BPF mode rtla/timerlat_top: Use BPF to collect samples rtla/timerlat_top: Move divisor to update rtla/timerlat_hist: Use BPF to collect samples rtla/timerlat: Add BPF skeleton to collect samples rtla: Add optional dependency on BPF tooling tools/build: Add bpftool-skeletons feature test rtla/timerlat: Unify params struct	2025-03-27 17:03:01 -07:00
John Kacur	732032692f	rtla: Add the ability to create ctags and etags - Add the ability to create and remove ctags and etags, using the following make tags make TAGS make tags_clean - fix a comment in Makefile.rtla with the correct spelling and don't imply that the ability to create an rtla tarball will be removed Cc: Tomas Glozar <tglozar@redhat.com> Cc: "Luis Claudio R . Goncalves" <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250321175053.29048-1-jkacur@redhat.com Signed-off-by: John Kacur <jkacur@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-26 10:39:40 -04:00
Tomas Glozar	a86150f310	rtla/tests: Test setting default options Add function to test engine to test with pre-set osnoise options, and use it to test whether osnoise period (as an example) is set correctly. The test works by pre-setting a high period of 10 minutes and stop on threshold. Thus, it is easy to check whether rtla is properly resetting the period to default: if it is, the test will complete on time, since the first sample will overflow the threshold. If not, it will time out. Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250320092500.101385-7-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Reviewed-by: John Kacur <jkacur@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-26 10:36:39 -04:00
Tomas Glozar	6c6182728a	rtla/tests: Reset osnoise options before check Remove any dangling tracing instances from previous improperly exited runs of rtla, and reset osnoise options to default before running a test case. This ensures that the test results are deterministic. Specific test cases checked that rtla behaves correctly even when the tracer state is not clean will be added later. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250320092500.101385-6-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-26 10:36:39 -04:00
Tomas Glozar	0122938a7a	rtla: Always set all tracer options rtla currently only sets tracer options that are explicitly set by the user, with the exception of OSNOISE_WORKLOAD. This leads to improper behavior in case rtla is run with those options not set to the default value. rtla does reset them to the original value upon exiting, but that does not protect it from starting with non-default values set either by an improperly exited rtla or by another user of the tracers. For example, after running this command: $ echo 1 > /sys/kernel/tracing/osnoise/stop_tracing_us all runs of rtla will stop at the 1us threshold, even if not requested by the user: $ rtla osnoise hist Index CPU-000 CPU-001 1 8 5 2 5 9 3 1 2 4 6 1 5 2 1 6 0 1 8 1 1 12 0 1 14 1 0 15 1 0 over: 0 0 count: 25 21 min: 1 1 avg: 3.68 3.05 max: 15 12 rtla osnoise hit stop tracing Fix the problem by setting the default value for all tracer options if the user has not provided their own value. For most of the options, it's enough to just drop the if clause checking for the value being set. For cpus, "all" is used as the default value, and for osnoise default period and runtime, default values of the osnoise_data variable in trace_osnoise.c are used. Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250320092500.101385-5-tglozar@redhat.com Fixes: `1eceb2fc2c` ("rtla/osnoise: Add osnoise top mode") Fixes: `829a6c0b56` ("rtla/osnoise: Add the hist mode") Fixes: `a828cd18bc` ("rtla: Add timerlat tool and timelart top mode") Fixes: `1eeb6328e8` ("rtla/timerlat: Add timerlat hist mode") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Reviewed-by: John Kacur <jkacur@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-26 10:36:39 -04:00
Tomas Glozar	a8122a63c9	rtla/osnoise: Set OSNOISE_WORKLOAD to true If running rtla osnoise with NO_OSNOISE_WORKLOAD, it reports no samples: $ echo NO_OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options $ rtla osnoise hist -d 10s Index over: 0 count: 0 min: 0 avg: 0 max: 0 This situation can also happen when running rtla-osnoise after an improperly exited rtla-timerlat run. Set OSNOISE_WORKLOAD in rtla-osnoise, too, similarly to what we already did for timerlat in commit `217f0b1e99` ("rtla/timerlat_top: Set OSNOISE_WORKLOAD for kernel threads") and commit `d8d866171a` ("rtla/timerlat_hist: Set OSNOISE_WORKLOAD for kernel threads"). Note that there is no user workload mode for rtla-osnoise yet, so OSNOISE_WORKLOAD is always set to true. Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250320092500.101385-4-tglozar@redhat.com Fixes: `1eceb2fc2c` ("rtla/osnoise: Add osnoise top mode") Fixes: `829a6c0b56` ("rtla/osnoise: Add the hist mode") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Reviewed-by: John Kacur <jkacur@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-26 10:36:39 -04:00
Tomas Glozar	20d6b07581	rtla: Unify apply_config between top and hist The functions osnoise_top_apply_config and osnoise_hist_apply_config, as well as timerlat_top_apply_config and timerlat_hist_apply_config, are mostly the same. Move common part from them into separate functions osnoise_apply_config and timerlat_apply_config. For rtla-timerlat, also unify params->user_hist and params->user_top into one field called params->user_data, and move several fields used only by timerlat-top into the top-only section of struct timerlat_params. Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250320092500.101385-3-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Reviewed-by: John Kacur <jkacur@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-26 10:36:39 -04:00
Tomas Glozar	025b217990	rtla/osnoise: Unify params struct Instead of having separate structs osnoise_top_params and osnoise_hist_params, use one struct osnoise_params for both. This allows code using the structs to be shared between osnoise-top and osnoise-hist. Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250320092500.101385-2-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Reviewed-by: John Kacur <jkacur@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-26 10:36:39 -04:00
Tomas Glozar	c57c58a62e	rtla: Fix segfault in save_trace_to_file call Running rtla with exit on threshold, but without saving trace leads to a segmenetation fault: $ rtla timerlat hist -T 10 ... Max timerlat IRQ latency from idle: 4.29 us in cpu 0 Segmentation fault This is caused by null pointer deference in the call of save_trace_to_file, which attempts to dereference an uninitialized osnoise_tool variable: save_trace_to_file(record->trace.inst, params->trace_output); ^ this is uninitialized if params->trace_output is not set Fix this by not attempting to dereference "record" if it is NULL and passing NULL instead. As a safety measure, the first field is also checked for NULL inside save_trace_to_file. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Costa Shulyupin <costa.shul@redhat.com> Link: https://lore.kernel.org/20250313141034.299117-1-tglozar@redhat.com Fixes: `dc4d4e7c72` ("rtla: Refactor save_trace_to_file") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-26 10:35:20 -04:00
Tomas Glozar	814d051ebe	tools/build: Use SYSTEM_BPFTOOL for system bpftool The feature test for system bpftool uses BPFTOOL as the variable to set its path, defaulting to just "bpftool" if not set by the user. This conflicts with selftests and a few other utilities, which expect BPFTOOL to be set to the in-tree bpftool path by default. For example, bpftool selftests fail to build: $ make -C tools/testing/selftests/bpf/ make: Entering directory '/home/tglozar/dev/linux/tools/testing/selftests/bpf' make: *** No rule to make target 'bpftool', needed by '/home/tglozar/dev/linux/tools/testing/selftests/bpf/tools/include/vmlinux.h'. Stop. make: Leaving directory '/home/tglozar/dev/linux/tools/testing/selftests/bpf' Fix the problem by renaming the variable used for system bpftool from BPFTOOL to SYSTEM_BPFTOOL, so that the new usage does not conflict with the existing one of BPFTOOL. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250326004018.248357-1-tglozar@redhat.com Fixes: `8a635c3856` ("tools/build: Add bpftool-skeletons feature test") Closes: https://lore.kernel.org/linux-kernel/5df6968a-2e5f-468e-b457-fc201535dd4c@linux.ibm.com/ Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Suggested-by: Quentin Monnet <qmo@kernel.org> Acked-by: Quentin Monnet <qmo@kernel.org> Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-26 10:31:31 -04:00
Costa Shulyupin	dc4d4e7c72	rtla: Refactor save_trace_to_file The functions osnoise_hist_main(), osnoise_top_main(), timerlat_hist_main(), and timerlat_top_main() are lengthy and contain duplicated code. Refactor by consolidating the duplicate lines into the save_trace_to_file() function. Cc: Daniel Bristot de Oliveira <bristot@kernel.org> Cc: John Kacur <jkacur@redhat.com> Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com> Cc: Eder Zulian <ezulian@redhat.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250219115138.406075-1-costa.shul@redhat.com Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Tested-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-04 14:17:19 -05:00
Tomas Glozar	005682b403	rtla/timerlat: Test BPF mode Using the RTLA_NO_BPF environmental variable, execute rtla-timerlat tests both with and without BPF support to cover both paths. If rtla is built without BPF or the osnoise:timerlat_sample trace event is not available, test only the non-BPF path. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Cc: Clark Williams <williams@redhat.com> Link: https://lore.kernel.org/20250218145859.27762-9-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-04 12:35:18 -05:00
Tomas Glozar	9a82a3fd9e	rtla/timerlat_top: Use BPF to collect samples Collect samples using BPF program instead of pulling them from tracefs. If the osnoise:timerlat_sample tracepoint is unavailable or the BPF program fails to load for whatever reason, rtla falls back to the old implementation. The collection of samples using the BPF program is fully self-contained and requires no activity of the userspace part of rtla during the measurement. Thus, rtla only pulls the summary from the BPF map and displays it every second, improving the performance. In --aa-only mode, the BPF program does not collect any data and only signalizes the end of tracing to userspace. An optimization that re-used the main trace instance for auto-analysis in aa-only mode was dropped, as rtla no longer turns tracing on in the main trace instance, making it useless for auto-analysis. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Cc: Clark Williams <williams@redhat.com> Link: https://lore.kernel.org/20250218145859.27762-8-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-04 12:35:18 -05:00
Tomas Glozar	18923806b1	rtla/timerlat_top: Move divisor to update Unlike timerlat-hist, timerlat-top applies the output divisor used to set ns/us mode when printing results instead of applying it when collecting the samples. Move the application of the divisor from timerlat_top_print into timerlat_top_update to make it consistent with timerlat-hist. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Cc: Clark Williams <williams@redhat.com> Link: https://lore.kernel.org/20250218145859.27762-7-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-04 12:35:18 -05:00
Tomas Glozar	fd7925cbb7	rtla/timerlat_hist: Use BPF to collect samples Collect samples using BPF program instead of pulling them from tracefs. If the osnoise:timerlat_sample tracepoint is unavailable or the BPF program fails to load for whatever reason, rtla falls back to the old implementation. The collection of samples using the BPF program is fully self-contained and requires no activity of the userspace part of rtla during the measurement. Thus, instead of waking up every second to collect samples, rtla simply sleeps until woken up by a signal or threshold overflow. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Cc: Clark Williams <williams@redhat.com> Link: https://lore.kernel.org/20250218145859.27762-6-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-04 12:35:18 -05:00
Tomas Glozar	e34293ddce	rtla/timerlat: Add BPF skeleton to collect samples Add BPF program that attaches to the osnoise:timerlat_sample tracepoint and collects both the summary and the histogram (if requested) into BPF maps (one map of each kind per context). The program is designed to be used for both timerlat-top and timerlat-hist. If using with timerlat-top, the "entries" parameter is set to zero, which prevents the BPF program from recording histogram entries. In that case, the maps for histograms do not have to be created, as the BPF verifier will identify the code using them as unreachable. An IRQ or thread latency threshold might be supplied to stop recording if hit, similar to the timerlat tracer threshold, which stops ftrace tracing if hit. A BPF ringbuffer is used to signal threshold overflow to userspace. In aa-only mode, this is the only function of the BPF program. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Cc: Clark Williams <williams@redhat.com> Link: https://lore.kernel.org/20250218145859.27762-5-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-04 12:35:18 -05:00
Tomas Glozar	9dc3766ed0	rtla: Add optional dependency on BPF tooling If tooling required for building BPF CO-RE skeletons is present (that is, libbpf, clang with BPF CO-RE support, and bpftool), turn on HAVE_BPF_SKEL flag. Those requirements are similar to what perf requires, with the difference of using system libbpf and bpftool instead of in-tree versions. rtla can be forcefully built without BPF skeleton support by setting BUILD_BPF_SKEL=0 manually; in that case, a warning is displayed. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Cc: Clark Williams <williams@redhat.com> Link: https://lore.kernel.org/20250218145859.27762-4-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-04 12:35:17 -05:00
Tomas Glozar	6fa5e3a87c	rtla/timerlat: Unify params struct Instead of having separate structs timerlat_top_params and timerlat_hist_params, use one struct timerlat_params for both. This allows code using the structs to be shared between timerlat-top and timerlat-hist. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Cc: Clark Williams <williams@redhat.com> Link: https://lore.kernel.org/20250218145859.27762-2-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-03-04 12:35:17 -05:00
Linus Torvalds	9f5270d758	perf tools fixes for v6.14: 2nd batch - Fix tools/ quiet build Makefile infrastructure that was broken when working on tools/perf/ without testing on other tools/ living utilities. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZ74OJwAKCRCyPKLppCJ+ J30mAPsHCA8A+CNq/5yW2VhFLV1GgCSL5oWqxXRn7QjhSrCQBQEAot2u4O5zXs7M sg+mPlYiS1oT+zmvTLlXrN+bVyWP9A4= =jH1N -----END PGP SIGNATURE----- Merge tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix tools/ quiet build Makefile infrastructure that was broken when working on tools/perf/ without testing on other tools/ living utilities. * tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: tools: Remove redundant quiet setup tools: Unify top-level quiet infrastructure	2025-02-25 13:32:32 -08:00
Charlie Jenkins	42367eca76	tools: Remove redundant quiet setup Q is exported from Makefile.include so it is not necessary to manually set it. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <qmo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Benjamin Tissoires <bentiss@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: KP Singh <kpsingh@kernel.org> Cc: Lukasz Luba <lukasz.luba@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Mykola Lysenko <mykolal@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20250213-quiet_tools-v3-2-07de4482a581@rivosinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2025-02-18 16:27:43 -03:00
Linus Torvalds	40648d246f	rv: tools/rtla: Updates for 6.14 - Add a test suite to test the tool Add a small test suite that can be used to test rtla's basic features to at least have something to test when applying changes. - Automate manual steps in monitor creation While creating a new monitor in RV, besides generating code from dot2k, there are a few manual steps which can be tedious and error prone, like adding the tracepoints, makefile lines and kconfig, or selecting events that start the monitor in the initial state. Updates were made to try and automate as much as possible among those steps to make creating a new RV monitor much quicker. It is still requires to select proper tracepoints, this step is harder to automate in a general way and, in several cases, would still need user intervention. - Have rtla timerlat hist and top set OSNOISE_WORKLOAD flag Have both rtla-timerlat-hist and rtla-timerlat-top set OSNOISE_WORKLOAD to the proper value ("on" when running with -k, "off" when running with -u) every time the option is available instead of setting it only when running with -u. This prevents rtla timerlat -k from giving no results when NO_OSNOISE_WORKLOAD is set, either manually or by an abnormally exited earlier run of rtla timerlat -u. - Stop rtla timerlat on signal properly when overloaded There is an issue where if rtla is run on machines with a high number of CPUs (100+), timerlat can generate more samples than rtla is able to process via tracefs_iterate_raw_events. This is especially common when the interval is set to 100us (rteval and cyclictest default) as opposed to the rtla default of 1000us, but also happens with the rtla default. Currently, this leads to rtla hanging and having to be terminated with SIGTERM. SIGINT setting stop_tracing is not enough, since more and more events are coming and tracefs_iterate_raw_events never exits. To fix this: Stop the timerlat tracer on SIGINT/SIGALRM to ensure no more events are generated when rtla is supposed to exit. Also on receiving SIGINT/SIGALRM twice, abort iteration immediately with tracefs_iterate_stop, making rtla exit right away instead of waiting for all events to be processed. - Account for missed events Due to tracefs buffer overflow, it can happen that rtla misses events, making the tracing results inaccurate. Count both the number of missed events and the total number of processed events, and display missed events as well as their percentage. The numbers are displayed for both osnoise and timerlat, even though for the earlier, missed events are generally not expected. For hist, the number is displayed at the end of the run; for top, it is displayed on each printing of the top table. - Changes to make osnoise more robust There was a dependency in the code that the first field of the osnoise_tool structure was the trace field. If that that ever changed, then the code work break. Change the code to encapsulate this dependency where the code that uses the structure does not have this dependency. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ5UQ4BQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qktFAQD2px6MyoOVTssB5Iw3aTWGUfTFoDEc bfng5JsBxlVJkQEA+2UUvP8FJlLTOQvVEwJiscX7CCJxl5bYkV6GWuGRxQU= =h//9 -----END PGP SIGNATURE----- Merge tag 'trace-tools-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull rv and tools/rtla updates from Steven Rostedt: - Add a test suite to test the tool Add a small test suite that can be used to test rtla's basic features to at least have something to test when applying changes. - Automate manual steps in monitor creation While creating a new monitor in RV, besides generating code from dot2k, there are a few manual steps which can be tedious and error prone, like adding the tracepoints, makefile lines and kconfig, or selecting events that start the monitor in the initial state. Updates were made to try and automate as much as possible among those steps to make creating a new RV monitor much quicker. It is still requires to select proper tracepoints, this step is harder to automate in a general way and, in several cases, would still need user intervention. - Have rtla timerlat hist and top set OSNOISE_WORKLOAD flag Have both rtla-timerlat-hist and rtla-timerlat-top set OSNOISE_WORKLOAD to the proper value ("on" when running with -k, "off" when running with -u) every time the option is available instead of setting it only when running with -u. This prevents rtla timerlat -k from giving no results when NO_OSNOISE_WORKLOAD is set, either manually or by an abnormally exited earlier run of rtla timerlat -u. - Stop rtla timerlat on signal properly when overloaded There is an issue where if rtla is run on machines with a high number of CPUs (100+), timerlat can generate more samples than rtla is able to process via tracefs_iterate_raw_events. This is especially common when the interval is set to 100us (rteval and cyclictest default) as opposed to the rtla default of 1000us, but also happens with the rtla default. Currently, this leads to rtla hanging and having to be terminated with SIGTERM. SIGINT setting stop_tracing is not enough, since more and more events are coming and tracefs_iterate_raw_events never exits. To fix this: Stop the timerlat tracer on SIGINT/SIGALRM to ensure no more events are generated when rtla is supposed to exit. Also on receiving SIGINT/SIGALRM twice, abort iteration immediately with tracefs_iterate_stop, making rtla exit right away instead of waiting for all events to be processed. - Account for missed events Due to tracefs buffer overflow, it can happen that rtla misses events, making the tracing results inaccurate. Count both the number of missed events and the total number of processed events, and display missed events as well as their percentage. The numbers are displayed for both osnoise and timerlat, even though for the earlier, missed events are generally not expected. For hist, the number is displayed at the end of the run; for top, it is displayed on each printing of the top table. - Changes to make osnoise more robust There was a dependency in the code that the first field of the osnoise_tool structure was the trace field. If that that ever changed, then the code work break. Change the code to encapsulate this dependency where the code that uses the structure does not have this dependency. * tag 'trace-tools-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (22 commits) rtla: Report missed event count rtla: Add function to report missed events rtla: Count all processed events rtla: Count missed trace events tools/rtla: Add osnoise_trace_is_off() rtla/timerlat_top: Set OSNOISE_WORKLOAD for kernel threads rtla/timerlat_hist: Set OSNOISE_WORKLOAD for kernel threads rtla/osnoise: Distinguish missing workload option rtla/timerlat_top: Abort event processing on second signal rtla/timerlat_hist: Abort event processing on second signal rtla/timerlat_top: Stop timerlat tracer on signal rtla/timerlat_hist: Stop timerlat tracer on signal rtla: Add trace_instance_stop tools/rtla: Add basic test suite verification/dot2k: Implement event type detection verification/dot2k: Auto patch current kernel source verification/dot2k: Simplify manual steps in monitor creation rv: Simplify manual steps in monitor creation verification/dot2k: Add support for name and description options verification/dot2k: More robust template variables ...	2025-01-26 14:25:58 -08:00
Tomas Glozar	cf18620111	rtla: Report missed event count Print how many events were missed by trace buffer overflow in the main instance at the end of the run (for hist) or during the run (for top). Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250123142339.990300-5-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Tested-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:47:07 -05:00
Tomas Glozar	8ccd9d8bb9	rtla: Add function to report missed events Add osnoise_report_missed_events to be used to report the number of missed events either during or after an osnoise or timerlat run. Also, display the percentage of missed events compared to the total number of received events. If an unknown number of missed events was reported during the run, the entire number of missed events is reported as unknown. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250123142339.990300-4-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:46:43 -05:00
Tomas Glozar	2aee44f721	rtla: Count all processed events Add a field processed_events to struct trace_instance and increment it in collect_registered_events, regardless of whether a handler is registered for the event. The purpose is to calculate the percentage of events that were missed due to tracefs buffer overflow. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250123142339.990300-3-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:46:24 -05:00
Tomas Glozar	d6fcd28ffe	rtla: Count missed trace events Add function collect_missed_events to trace.c to act as a callback for tracefs_follow_missed_events, summing the number of total missed events into a new field missing_events of struct trace_instance. In case record->missed_events is negative, trace->missed_events is set to UINT64_MAX to signify an unknown number of events was missed. The callback is activated on initialization of the trace instance. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250123142339.990300-2-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:46:16 -05:00
Costa Shulyupin	b91cfd9f75	tools/rtla: Add osnoise_trace_is_off() All of the users of trace_is_off() passes in &record->trace as the second parameter, where record is a pointer to a struct osnoise_tool. This record could be NULL and there is a hidden dependency that the trace field is the first field to allow &record->trace to work with a NULL record pointer. In order to make this code a bit more robust, as record shouldn't be dereferenced if it is NULL, even if the code does work, create a new function called osnoise_trace_is_off() that takes the pointer to a struct osnoise_tool as its second parameter. This way it can properly test if it is NULL before it dereferences it. The old function trace_is_off() is removed and the function osnoise_trace_is_off() is added into osnoise.c which is what the struct osnoise_tool is associated with. Cc: John Kacur <jkacur@redhat.com> Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com> Cc: Eder Zulian <ezulian@redhat.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Tomas Glozar <tglozar@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250115180055.2136815-1-costa.shul@redhat.com Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:46:11 -05:00
Tomas Glozar	217f0b1e99	rtla/timerlat_top: Set OSNOISE_WORKLOAD for kernel threads When using rtla timerlat with userspace threads (-u or -U), rtla disables the OSNOISE_WORKLOAD option in /sys/kernel/tracing/osnoise/options. This option is not re-enabled in a subsequent run with kernel-space threads, leading to rtla collecting no results if the previous run exited abnormally: $ rtla timerlat top -u ^\Quit (core dumped) $ rtla timerlat top -k -d 1s Timer Latency 0 00:00:01 \| IRQ Timer Latency (us) \| Thread Timer Latency (us) CPU COUNT \| cur min avg max \| cur min avg max The issue persists until OSNOISE_WORKLOAD is set manually by running: $ echo OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options Set OSNOISE_WORKLOAD when running rtla with kernel-space threads if available to fix the issue. Cc: stable@vger.kernel.org Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250107144823.239782-4-tglozar@redhat.com Fixes: `cdca4f4e5e` ("rtla/timerlat_top: Add timerlat user-space support") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:46:03 -05:00
Tomas Glozar	d8d866171a	rtla/timerlat_hist: Set OSNOISE_WORKLOAD for kernel threads When using rtla timerlat with userspace threads (-u or -U), rtla disables the OSNOISE_WORKLOAD option in /sys/kernel/tracing/osnoise/options. This option is not re-enabled in a subsequent run with kernel-space threads, leading to rtla collecting no results if the previous run exited abnormally: $ rtla timerlat hist -u ^\Quit (core dumped) $ rtla timerlat hist -k -d 1s Index over: count: min: avg: max: ALL: IRQ Thr Usr count: 0 0 0 min: - - - avg: - - - max: - - - The issue persists until OSNOISE_WORKLOAD is set manually by running: $ echo OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options Set OSNOISE_WORKLOAD when running rtla with kernel-space threads if available to fix the issue. Cc: stable@vger.kernel.org Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250107144823.239782-3-tglozar@redhat.com Fixes: `ed774f7481` ("rtla/timerlat_hist: Add timerlat user-space support") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:45:55 -05:00
Tomas Glozar	80d3ba1cf5	rtla/osnoise: Distinguish missing workload option osnoise_set_workload returns -1 for both missing OSNOISE_WORKLOAD option and failure in setting the option. Return -1 for missing and -2 for failure to distinguish them. Cc: stable@vger.kernel.org Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250107144823.239782-2-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:45:45 -05:00
Tomas Glozar	80967b354a	rtla/timerlat_top: Abort event processing on second signal If either SIGINT is received twice, or after a SIGALRM (that is, after timerlat was supposed to stop), abort processing events currently left in the tracefs buffer and exit immediately. This allows the user to exit rtla without waiting for processing all events, should that take longer than wanted, at the cost of not processing all samples. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250116144931.649593-6-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:45:37 -05:00
Tomas Glozar	d6899e5603	rtla/timerlat_hist: Abort event processing on second signal If either SIGINT is received twice, or after a SIGALRM (that is, after timerlat was supposed to stop), abort processing events currently left in the tracefs buffer and exit immediately. This allows the user to exit rtla without waiting for processing all events, should that take longer than wanted, at the cost of not processing all samples. Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250116144931.649593-5-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:45:31 -05:00
Tomas Glozar	a4dfce7559	rtla/timerlat_top: Stop timerlat tracer on signal Currently, when either SIGINT from the user or SIGALRM from the duration timer is caught by rtla-timerlat, stop_tracing is set to break out of the main loop. This is not sufficient for cases where the timerlat tracer is producing more data than rtla can consume, since in that case, rtla is looping indefinitely inside tracefs_iterate_raw_events, never reaches the check of stop_tracing and hangs. In addition to setting stop_tracing, also stop the timerlat tracer on received signal (SIGINT or SIGALRM). This will stop new samples so that the existing samples may be processed and tracefs_iterate_raw_events eventually exits. Cc: stable@vger.kernel.org Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250116144931.649593-4-tglozar@redhat.com Fixes: `a828cd18bc` ("rtla: Add timerlat tool and timelart top mode") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:45:23 -05:00
Tomas Glozar	c73cab9dbe	rtla/timerlat_hist: Stop timerlat tracer on signal Currently, when either SIGINT from the user or SIGALRM from the duration timer is caught by rtla-timerlat, stop_tracing is set to break out of the main loop. This is not sufficient for cases where the timerlat tracer is producing more data than rtla can consume, since in that case, rtla is looping indefinitely inside tracefs_iterate_raw_events, never reaches the check of stop_tracing and hangs. In addition to setting stop_tracing, also stop the timerlat tracer on received signal (SIGINT or SIGALRM). This will stop new samples so that the existing samples may be processed and tracefs_iterate_raw_events eventually exits. Cc: stable@vger.kernel.org Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250116144931.649593-3-tglozar@redhat.com Fixes: `1eeb6328e8` ("rtla/timerlat: Add timerlat hist mode") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:45:14 -05:00
Tomas Glozar	e879b5dcf8	rtla: Add trace_instance_stop Support not only turning trace on for the timerlat tracer, but also turning it off. This will be used in subsequent patches to stop the timerlat tracer without also wiping the trace buffer. Cc: stable@vger.kernel.org Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250116144931.649593-2-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-24 13:45:02 -05:00
Tomas Glozar	ab16714fcb	tools/rtla: Add basic test suite Implement a simple TAP-based test engine in bash and a few basic tests using it, to be used to check for bugs and regressions. A new "check" target is added to the rtla Makefile that runs the test suite using the "prove" command implemented by Test::Harness. The only test format currently supported is running rtla with defined command arguments per test, checking its exit code. In case the exit code is non-zero, the output of rtla is displayed, together with the exit code. The test cases are adopted from rtla tests in the Continuous Kernel Integration (CKI) project [1] with the authors' approval. [1] https://gitlab.com/redhat/centos-stream/tests/kernel/kernel-tests/-/blob/main/rt-tests/us/rtla/ Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Chang Yin <cyin@redhat.com> Cc: Qiao Zhao <qzhao@redhat.com> Link: https://lore.kernel.org/20250120135630.802111-1-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-23 12:21:45 -05:00

1 2 3 4

173 commits