linux/tools/perf/util
Kan Liang ff165628d7 perf callchain: Stitch LBR call stack
In LBR call stack mode, the depth of the reconstructed LBR call stack is
limited to the number of LBR registers.

  For example, on Skylake, the depth of the reconstructed LBR call stack
  is always <= 32.

  # To display the perf.data header info, please use
  # --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 6K of event 'cycles'
  # Event count (approx.): 6487119731
  #
  # Children      Self  Command          Shared Object       Symbol
  # ........  ........  ...............  ..................
  # ................................

    99.97%    99.97%  tchain_edit      tchain_edit        [.] f43
            |
             --99.64%--f11
                       f12
                       f13
                       f14
                       f15
                       f16
                       f17
                       f18
                       f19
                       f20
                       f21
                       f22
                       f23
                       f24
                       f25
                       f26
                       f27
                       f28
                       f29
                       f30
                       f31
                       f32
                       f33
                       f34
                       f35
                       f36
                       f37
                       f38
                       f39
                       f40
                       f41
                       f42
                       f43

For a call stack that is deeper than the LBR limit, HW overwrites the
LBR registers holding the oldest branches, so only a partial call stack
can be reconstructed.

However, the overwritten LBRs may still be retrievable from the previous
sample, which was taken before HW overwrote those LBR registers.
Perf tools can stitch those overwritten LBRs onto the current call stack
to get a more complete call stack.

To determine whether LBRs can be stitched, perf tools need to compare the
current sample with the previous sample.

- They should have identical LBR records (the same from, to and flags
  values, and the same physical index of LBR registers).

- The search starts from the base-of-stack of the current sample.
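
The comparison above can be sketched roughly as follows. This is a
simplified illustration, not the code added by this commit: the
'struct lbr_rec' layout and the lbr_rec_equal() and lbr_stitch_count()
helpers are hypothetical names, and records are assumed to be stored
newest first, so the base-of-stack is the last element of each array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified LBR record; perf's real structures differ. */
struct lbr_rec {
	uint64_t from;
	uint64_t to;
	uint64_t flags;
	int hw_idx;	/* physical index of the LBR register */
};

/* Two records are identical only if every field matches. */
bool lbr_rec_equal(const struct lbr_rec *a, const struct lbr_rec *b)
{
	return a->from == b->from && a->to == b->to &&
	       a->flags == b->flags && a->hw_idx == b->hw_idx;
}

/*
 * Look in the previous sample for the record that matches the
 * base-of-stack of the current sample. Every record of the previous
 * sample that is older than the match was overwritten in the current
 * sample and can be stitched below it.
 *
 * Returns the number of stitchable records, or 0 if there is no match.
 */
size_t lbr_stitch_count(const struct lbr_rec *prev, size_t prev_nr,
			const struct lbr_rec *cur, size_t cur_nr)
{
	const struct lbr_rec *cur_base;
	size_t i;

	assert(prev != NULL && cur != NULL);
	if (prev_nr == 0 || cur_nr == 0)
		return 0;

	cur_base = &cur[cur_nr - 1];

	/* Walk the previous sample from its base-of-stack upward. */
	for (i = prev_nr; i-- > 0; ) {
		if (lbr_rec_equal(&prev[i], cur_base))
			return prev_nr - 1 - i;
	}
	return 0;
}
```

With four LBR registers, a previous sample holding C, B, A and a current
sample holding E, D, C, B (E having reused A's register) would give a
count of 1: only A needs to be stitched below B.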

Once perf decides to stitch the previous LBRs, the corresponding LBR
cursor nodes are copied to 'lists', which tracks the LBR cursor nodes
that are going to be stitched.

When the stitching is over, the nodes are not freed immediately.
They are moved to 'free_lists' so that the next stitching can reuse the
space. Both 'lists' and 'free_lists' are freed when all samples have been
processed.
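
The node recycling described above could look roughly like the sketch
below. It is an illustration under stated assumptions, not the code in
this commit: it uses plain singly linked lists, and 'struct stitch_node',
'struct stitch_state' and the three helper functions are hypothetical
names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node standing in for a stitched LBR cursor node. */
struct stitch_node {
	struct stitch_node *next;
	unsigned long ip;
};

/* Hypothetical container for the two lists. */
struct stitch_state {
	struct stitch_node *lists;	/* nodes queued for stitching */
	struct stitch_node *free_lists;	/* retired nodes kept for reuse */
};

/* Reuse a node from free_lists if one exists, otherwise allocate. */
struct stitch_node *stitch_node_get(struct stitch_state *st, unsigned long ip)
{
	struct stitch_node *n;

	assert(st != NULL);
	n = st->free_lists;
	if (n)
		st->free_lists = n->next;
	else
		n = malloc(sizeof(*n));
	if (!n)
		return NULL;

	n->ip = ip;
	n->next = st->lists;
	st->lists = n;
	return n;
}

/* After stitching, move every node from lists to free_lists. */
void stitch_retire(struct stitch_state *st)
{
	assert(st != NULL);
	while (st->lists) {
		struct stitch_node *n = st->lists;

		st->lists = n->next;
		n->next = st->free_lists;
		st->free_lists = n;
	}
}

/* Release both lists once all samples have been processed. */
void stitch_free_all(struct stitch_state *st)
{
	stitch_retire(st);
	while (st->free_lists) {
		struct stitch_node *n = st->free_lists;

		st->free_lists = n->next;
		free(n);
	}
}
```

Keeping retired nodes around trades a small amount of memory for fewer
malloc()/free() calls, since a node is allocated only when 'free_lists'
is empty.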

Committer notes:

Fix the intel-pt.c initialization of the union with 'struct
branch_flags', whose unnamed union breaks the build on older gcc
versions.

Uninline thread__free_stitch_list(), as it grew big and started dragging
includes into thread.h. Move it to thread.c, where the headers it needs
are already included.

This fixes the build on several systems, such as debian:experimental when
cross building to the MIPS32 architecture. In the other cases, what was
needed was being included by sheer luck.

  In file included from builtin-sched.c:11:
  util/thread.h: In function 'thread__free_stitch_list':
  util/thread.h:169:3: error: implicit declaration of function 'free' [-Werror=implicit-function-declaration]
    169 |   free(pos);
        |   ^~~~
  util/thread.h:169:3: error: incompatible implicit declaration of built-in function 'free' [-Werror]
  util/thread.h:19:1: note: include '<stdlib.h>' or provide a declaration of 'free'
     18 | #include "callchain.h"
    +++ |+#include <stdlib.h>
     19 |
  util/thread.h:174:3: error: incompatible implicit declaration of built-in function 'free' [-Werror]
    174 |   free(pos);
        |   ^~~~
  util/thread.h:174:3: note: include '<stdlib.h>' or provide a declaration of 'free'

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Link: http://lore.kernel.org/lkml/20200319202517.23423-13-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-18 09:05:01 -03:00
c++ perf clang: Fix build with Clang 9 2020-01-14 12:02:19 -03:00
cs-etm-decoder
include perf bench: Update the copies of x86's mem{cpy,set}_64.S 2019-12-02 11:40:57 -03:00
intel-pt-decoder
libunwind
scripting-engines perf script report: Fix SEGFAULT when using DWARF mode 2020-04-03 09:39:53 -03:00
affinity.c perf affinity: Add infrastructure to save/restore affinity 2019-11-28 08:08:38 -03:00
affinity.h perf affinity: Add infrastructure to save/restore affinity 2019-11-28 08:08:38 -03:00
annotate.c perf annotate: Add basic support for bpf_image 2020-04-16 12:19:06 -03:00
annotate.h perf report: Support interactive annotation of code without symbols 2020-03-24 09:36:33 -03:00
archinsn.h
arm-spe-pkt-decoder.c
arm-spe-pkt-decoder.h
arm-spe.c perf arm-spe: Implement ->evsel_is_auxtrace() callback 2020-04-16 12:19:15 -03:00
arm-spe.h
auxtrace.c perf evsel: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event() 2020-04-18 09:04:32 -03:00
auxtrace.h perf auxtrace: Add an option to synthesize callchains for regular events 2020-04-16 12:19:15 -03:00
block-info.c perf block-info: Support color ops to print block percents in color 2020-03-09 21:43:25 -03:00
block-info.h perf block-info: Allow selecting which columns to report and its order 2020-03-09 21:43:25 -03:00
block-range.c
block-range.h
bpf-event.c perf tools: Synthesize bpf_trampoline/dispatcher ksymbol event 2020-04-16 12:19:06 -03:00
bpf-event.h
bpf-loader.c
bpf-loader.h
bpf-prologue.c
bpf-prologue.h
bpf_map.c
bpf_map.h
branch.c
branch.h perf callchain: Stitch LBR call stack 2020-04-18 09:05:01 -03:00
Build perf expr: Move expr lexer to flex 2020-03-09 21:43:24 -03:00
build-id.c
build-id.h
cache.h
cacheline.c
cacheline.h
call-path.c
call-path.h
callchain.c perf map_symbol: Rename ms->mg to ms->maps 2019-11-26 11:07:46 -03:00
callchain.h perf callchain: Stitch LBR call stack 2020-04-18 09:05:01 -03:00
cap.c
cap.h perf tools: Support CAP_PERFMON capability 2020-04-16 12:19:08 -03:00
cgroup.c perf cgroup: Maintain cgroup hierarchy 2020-04-03 09:37:55 -03:00
cgroup.h perf cgroup: Maintain cgroup hierarchy 2020-04-03 09:37:55 -03:00
cloexec.c
cloexec.h
color.c
color.h
color_config.c
comm.c
comm.h
compress.h
config.c perf config: Introduce perf_config_u8() 2020-02-27 10:44:54 -03:00
config.h perf config: Introduce perf_config_u8() 2020-02-27 10:44:54 -03:00
copyfile.c
copyfile.h
counts.c
counts.h
cpu-set-sched.h
cpumap.c perf cpumap: Fix snprintf overflow check 2020-03-24 10:36:00 -03:00
cpumap.h perf evsel: Add iterator to iterate over events ordered by CPU 2019-11-29 12:20:45 -03:00
cputopo.c
cputopo.h
cs-etm.c perf cs-etm: Implement ->evsel_is_auxtrace() callback 2020-04-16 12:19:15 -03:00
cs-etm.h
data-convert-bt.c
data-convert-bt.h
data-convert.h
data.c
data.h
db-export.c perf addr_location: Rename al->mg to al->maps 2019-11-26 11:07:46 -03:00
db-export.h
debug.c perf tool: Provide an option to print perf_event_open args and return value 2019-11-12 08:32:27 -03:00
debug.h perf tool: Provide an option to print perf_event_open args and return value 2019-11-12 08:32:27 -03:00
demangle-java.c
demangle-java.h
demangle-rust.c
demangle-rust.h
dso.c perf annotate: Add basic support for bpf_image 2020-04-16 12:19:06 -03:00
dso.h perf annotate: Add basic support for bpf_image 2020-04-16 12:19:06 -03:00
dsos.c perf dso: Fix dso comparison 2020-03-24 10:57:38 -03:00
dsos.h perf dso: Move dso_id from 'struct map' to 'struct dso' 2019-11-19 19:12:26 -03:00
dump-insn.c
dump-insn.h
dwarf-aux.c perf probe: Show correct statement line number by perf probe -l 2019-11-18 18:56:27 -03:00
dwarf-aux.h
dwarf-regs.c
env.c perf cgroup: Maintain cgroup hierarchy 2020-04-03 09:37:55 -03:00
env.h perf header: Support CPU PMU capabilities 2020-04-18 09:05:00 -03:00
event.c perf script: Allow --symbol to accept hexadecimal addresses 2020-04-03 09:37:56 -03:00
event.h perf tools: Basic support for CGROUP event 2020-04-03 09:37:55 -03:00
events_stats.h
evlist.c perf evlist: Allow multiple read formats 2020-04-18 09:05:00 -03:00
evlist.h perf stat: Use affinity for opening events 2019-11-29 12:20:45 -03:00
evsel.c perf stat: Force error in fallback on :k events 2020-04-18 09:05:00 -03:00
evsel.h perf evsel: Move and globalize perf_evsel__find_pmu() and perf_evsel__is_aux_event() 2020-04-18 09:04:32 -03:00
evsel_config.h perf parse: Copy string to perf_evsel_config_term 2020-01-30 11:55:02 +01:00
evsel_fprintf.c perf callchain: Use 'struct map_symbol' in 'struct callchain_cursor_node' 2019-11-12 08:20:53 -03:00
evsel_fprintf.h
evswitch.c
evswitch.h
expr.c perf expr: Add expr_scanner_ctx object 2020-04-16 12:19:13 -03:00
expr.h perf expr: Add expr_scanner_ctx object 2020-04-16 12:19:13 -03:00
expr.l perf expr: Add expr_scanner_ctx object 2020-04-16 12:19:13 -03:00
expr.y perf expr: Add expr_ prefix for parse_ctx and parse_id 2020-04-16 12:19:13 -03:00
find-map.c
fncache.c perf pmu: Use file system cache to optimize sysfs access 2019-11-28 08:08:38 -03:00
fncache.h perf pmu: Use file system cache to optimize sysfs access 2019-11-28 08:08:38 -03:00
genelf.c perf jit: Move test functionality in to a test 2019-11-29 12:20:45 -03:00
genelf.h
genelf_debug.c
generate-cmdlist.sh
get_current_dir_name.c
get_current_dir_name.h
group.h
header.c perf header: Support CPU PMU capabilities 2020-04-18 09:05:00 -03:00
header.h perf header: Support CPU PMU capabilities 2020-04-18 09:05:00 -03:00
help-unknown-cmd.c
help-unknown-cmd.h
hist.c perf report: Add 'cgroup' sort key 2020-04-03 09:37:55 -03:00
hist.h perf report: Add 'cgroup' sort key 2020-04-03 09:37:55 -03:00
intel-bts.c perf intel-bts: Implement ->evsel_is_auxtrace() callback 2020-04-16 12:19:15 -03:00
intel-bts.h
intel-pt.c perf callchain: Stitch LBR call stack 2020-04-18 09:05:01 -03:00
intel-pt.h
intlist.c
intlist.h
jit.h
jitdump.c
jitdump.h
kvm-stat.h
levenshtein.c
levenshtein.h
llvm-utils.c perf llvm: Add debug hint message about missing kernel-devel package 2020-03-04 10:34:10 -03:00
llvm-utils.h
lzma.c
machine.c perf callchain: Stitch LBR call stack 2020-04-18 09:05:01 -03:00
machine.h perf tools: Basic support for CGROUP event 2020-04-03 09:37:55 -03:00
map.c perf map: Use strstarts() to look for Android libraries 2020-03-11 10:48:44 -03:00
map.h perf maps: Merge 'struct maps' with 'struct map_groups' 2019-11-26 11:07:46 -03:00
map_symbol.h perf map_symbol: Rename ms->mg to ms->maps 2019-11-26 11:07:46 -03:00
maps.h perf maps: Rename map_groups.h to maps.h 2019-11-26 11:07:46 -03:00
mem-events.c pref tools: Make 'struct addr_map_symbol' contain 'struct map_symbol' 2019-11-12 08:20:53 -03:00
mem-events.h
mem2node.c
mem2node.h
memswap.c
memswap.h
metricgroup.c perf metrictroup: Split the metricgroup__add_metric function 2020-04-16 12:19:13 -03:00
metricgroup.h
mmap.c perf record: Fix binding of AIO user space buffers to nodes 2020-03-12 11:32:46 -03:00
mmap.h perf record: Adapt affinity to machines with #CPUs > 1K 2020-01-06 11:46:09 -03:00
namespaces.c
namespaces.h
ordered-events.c
ordered-events.h
parse-branch-options.c
parse-branch-options.h
parse-events.c perf parse-events: Fix 3 use after frees found with clang ASAN 2020-03-23 11:08:29 -03:00
parse-events.h perf record: Add aux-sample-size config term 2019-11-22 10:48:13 -03:00
parse-events.l perf parser: Add support to specify rXXX event with pmu 2020-04-18 09:05:00 -03:00
parse-events.y perf parser: Add support to specify rXXX event with pmu 2020-04-18 09:05:00 -03:00
parse-regs-options.c
parse-regs-options.h
path.c
path.h
perf-hooks-list.h
perf-hooks.c
perf-hooks.h
PERF-VERSION-GEN
perf_event_attr_fprintf.c perf tools: Basic support for CGROUP event 2020-04-03 09:37:55 -03:00
perf_regs.c
perf_regs.h perf regs: Make perf_reg_name() return "unknown" instead of NULL 2019-11-28 08:08:38 -03:00
pmu.c perf pmu: Add support for PMU capabilities 2020-04-18 09:05:00 -03:00
pmu.h perf pmu: Add support for PMU capabilities 2020-04-18 09:05:00 -03:00
pmu.l
pmu.y
print_binary.c
print_binary.h
probe-event.c perf maps: Rename map_groups.h to maps.h 2019-11-26 11:07:46 -03:00
probe-event.h perf probe: Trace a magic number if variable is not found 2019-11-18 19:09:23 -03:00
probe-file.c perf probe: Fix to delete multiple probe event 2020-03-09 10:41:14 -03:00
probe-file.h perf probe: Support DW_AT_const_value constant value 2019-11-18 19:08:02 -03:00
probe-finder.c perf probe: Do not depend on dwfl_module_addrsym() 2020-03-09 10:43:53 -03:00
probe-finder.h perf probe: Trace a magic number if variable is not found 2019-11-18 19:09:23 -03:00
pstack.c
pstack.h
python-ext-sources perf python: Include rwsem.c in the pythong biding 2020-04-03 09:37:55 -03:00
python.c perf tool: Provide an option to print perf_event_open args and return value 2019-11-12 08:32:27 -03:00
rb_resort.h
rblist.c
rblist.h
record.c perf tools: Add support for leader-sampling with AUX area events 2020-04-18 09:05:00 -03:00
record.h perf record: Add --all-cgroups option 2020-04-03 09:37:55 -03:00
rlimit.c
rlimit.h
rwsem.c
rwsem.h
s390-cpumcf-kernel.h perf s390-cpumsf: Implement ->evsel_is_auxtrace() callback 2020-04-16 12:19:15 -03:00
s390-cpumsf-kernel.h
s390-cpumsf.c perf auxtrace: Add an option to synthesize callchains for regular events 2020-04-16 12:19:15 -03:00
s390-cpumsf.h
s390-sample-raw.c
sample-raw.c
sample-raw.h
session.c perf tools: Basic support for CGROUP event 2020-04-03 09:37:55 -03:00
session.h perf session: Add facility to peek at all events 2019-11-22 10:48:13 -03:00
setns.c
setup.py perf python: Check if clang supports -fno-semantic-interposition 2020-04-14 08:43:18 -03:00
smt.c
smt.h
sort.c perf report: Add 'cgroup' sort key 2020-04-03 09:37:55 -03:00
sort.h perf report: Add 'cgroup' sort key 2020-04-03 09:37:55 -03:00
spark.c
spark.h
srccode.c perf pmu: Use file system cache to optimize sysfs access 2019-11-28 08:08:38 -03:00
srccode.h
srcline.c perf: Make perf able to build with latest libbfd 2020-01-30 11:55:26 +01:00
srcline.h
stat-display.c perf stat: Align the output for interval aggregation mode 2020-03-24 09:37:27 -03:00
stat-shadow.c perf expr: Add expr_ prefix for parse_ctx and parse_id 2020-04-16 12:19:13 -03:00
stat.c perf stat: Use affinity for opening events 2019-11-29 12:20:45 -03:00
stat.h perf stat: Show percore counts in per CPU output 2020-03-04 10:34:09 -03:00
strbuf.c
strbuf.h
strfilter.c
strfilter.h
string.c
string2.h
strlist.c
strlist.h
svghelper.c
svghelper.h
symbol-elf.c perf symbols: Consolidate symbol fixup issue 2020-03-23 11:08:29 -03:00
symbol-minimal.c
symbol.c perf annotate: Add basic support for bpf_image 2020-04-16 12:19:06 -03:00
symbol.h perf maps: Rename 'mg' variables to 'maps' 2019-11-26 11:07:46 -03:00
symbol_conf.h perf report: Allow specifying event to be used as sort key in --group output 2020-03-24 09:37:27 -03:00
symbol_fprintf.c
symsrc.h
synthetic-events.c perf synthetic-events: save 4kb from 2 stack frames 2020-04-16 12:19:13 -03:00
synthetic-events.h perf record: Support synthesizing cgroup events 2020-04-03 09:37:55 -03:00
syscalltbl.c
syscalltbl.h
target.c
target.h
term.c
term.h
thread-stack.c perf thread-stack: Add thread_stack__sample_late() 2020-04-16 12:19:15 -03:00
thread-stack.h perf thread-stack: Add thread_stack__sample_late() 2020-04-16 12:19:15 -03:00
thread.c perf callchain: Stitch LBR call stack 2020-04-18 09:05:01 -03:00
thread.h perf callchain: Stitch LBR call stack 2020-04-18 09:05:01 -03:00
thread_map.c
thread_map.h
time-utils.c
time-utils.h
tool.h perf record: Support synthesizing cgroup events 2020-04-03 09:37:55 -03:00
top.c
top.h
trace-event-info.c
trace-event-parse.c
trace-event-read.c
trace-event-scripting.c
trace-event.c
trace-event.h
trigger.h
tsc.c
tsc.h
units.c
units.h
unwind-libdw.c perf map_symbol: Rename ms->mg to ms->maps 2019-11-26 11:07:46 -03:00
unwind-libdw.h
unwind-libunwind-local.c perf maps: Rename 'mg' variables to 'maps' 2019-11-26 11:07:46 -03:00
unwind-libunwind.c perf maps: Rename 'mg' variables to 'maps' 2019-11-26 11:07:46 -03:00
unwind.h perf maps: Merge 'struct maps' with 'struct map_groups' 2019-11-26 11:07:46 -03:00
usage.c
util.c perf tools: Support CAP_PERFMON capability 2020-04-16 12:19:08 -03:00
util.h perf util: Factor out sysctl__nmi_watchdog_enabled() 2020-03-10 14:46:19 -03:00
values.c
values.h
vdso.c perf thread: Rename thread->mg to thread->maps 2019-11-26 11:07:46 -03:00
vdso.h
xyarray.c
zlib.c
zstd.c