Commit graph

23 commits

Author SHA1 Message Date
Ian Rogers
8838abf626 perf build: Rename HAVE_DWARF_SUPPORT to HAVE_LIBDW_SUPPORT
In Makefile.config for unwinding the name dwarf implies either
libunwind or libdw. Make it clearer that HAVE_DWARF_SUPPORT is really
just defined when libdw is present by renaming to HAVE_LIBDW_SUPPORT.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241017001354.56973-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18 10:17:40 -07:00
Namhyung Kim
1cfd01eb60 perf annotate-data: Copy back variable types after move
In some cases, compilers don't set the location expression in DWARF
precisely.  For instance, it may assign a variable to a register after
copying it from a different register.  Then it should use the register
for the new type but still uses the old register.  This makes hard to
track the type information properly.

This is an example I found in __tcp_transmit_skb().  The first argument
(sk) of this function is a pointer to sock and there's a variable (tp)
for tcp_sock.

  static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
  				int clone_it, gfp_t gfp_mask, u32 rcv_nxt)
  {
  	...
  	struct tcp_sock *tp;

  	BUG_ON(!skb || !tcp_skb_pcount(skb));
  	tp = tcp_sk(sk);
  	prior_wstamp = tp->tcp_wstamp_ns;
  	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
  	...

So it basically calls tcp_sk(sk) to get the tcp_sock pointer from sk.
But it turned out to be the same value because tcp_sock embeds sock as
the first member.  The sk is located in reg5 (RDI) and tp is in reg3
(RBX).  The offset of tcp_wstamp_ns is 0x748 and tcp_clock_cache is
0x750.  So you need to use RBX (reg3) to access the fields in the
tcp_sock.  But the code used RDI (reg5) as it has the same value.

  $ pahole --hex -C tcp_sock vmlinux | grep -e 748 -e 750
	u64                tcp_wstamp_ns;        /* 0x748   0x8 */
	u64                tcp_clock_cache;      /* 0x750   0x8 */

And this is the disassembly of the part of the function.

  <__tcp_transmit_skb>:
  ...
  44:  mov    %rdi, %rbx
  47:  mov    0x748(%rdi), %rsi
  4e:  mov    0x750(%rdi), %rax
  55:  cmp    %rax, %rsi

Because compiler put the debug info to RBX, it only knows RDI is a
pointer to sock and accessing those two fields resulted in error
due to offset being beyond the type size.

  -----------------------------------------------------------
  find data type for 0x748(reg5) at __tcp_transmit_skb+0x63
  CU for net/ipv4/tcp_output.c (die:0x817f543)
  frame base: cfa=0 fbreg=6
  scope: [1/1] (die:81aac3e)
  bb: [0 - 30]
  var [0] -0x98(stack) type='struct tcp_out_options' size=0x28 (die:0x81af3df)
  var [5] reg8 type='unsigned int' size=0x4 (die:0x8180ed6)
  var [5] reg2 type='unsigned int' size=0x4 (die:0x8180ed6)
  var [5] reg1 type='int' size=0x4 (die:0x818059e)
  var [5] reg4 type='struct sk_buff*' size=0x8 (die:0x8181360)
  var [5] reg5 type='struct sock*' size=0x8 (die:0x8181a0c)                   <<<--- the first argument ('sk' at %RDI)
  mov [19] reg8 -> -0xa8(stack) type='unsigned int' size=0x4 (die:0x8180ed6)
  mov [20] stack canary -> reg0
  mov [29] reg0 -> -0x30(stack) stack canary
  bb: [36 - 3e]
  mov [36] reg4 -> reg15 type='struct sk_buff*' size=0x8 (die:0x8181360)
  bb: [44 - 63]
  mov [44] reg5 -> reg3 type='struct sock*' size=0x8 (die:0x8181a0c)          <<<--- calling tcp_sk()
  var [47] reg3 type='struct tcp_sock*' size=0x8 (die:0x819eead)              <<<--- new variable ('tp' at %RBX)
  var [4e] reg4 type='unsigned long long' size=0x8 (die:0x8180edd)
  mov [58] reg4 -> -0xc0(stack) type='unsigned long long' size=0x8 (die:0x8180edd)
  chk [63] reg5 offset=0x748 ok=1 kind=1 (struct sock*) : offset bigger than size    <<<--- access with old variable
  final result: offset bigger than size

While it's a fault in the compiler, we could work around this issue by
using the type of new variable when it's copied directly.  So I've added
copied_from field in the register state to track those direct register
to register copies.  After that new register gets a new type and the old
register still has the same type, it'll update (copy it back) the type
of the old register.

For example, if we can update type of reg5 at __tcp_transmit_skb+0x47,
we can find the target type of the instruction at 0x63 like below:

  -----------------------------------------------------------
  find data type for 0x748(reg5) at __tcp_transmit_skb+0x63
  ...
  bb: [44 - 63]
  mov [44] reg5 -> reg3 type='struct sock*' size=0x8 (die:0x8181a0c)
  var [47] reg3 type='struct tcp_sock*' size=0x8 (die:0x819eead)
  var [47] copyback reg5 type='struct tcp_sock*' size=0x8 (die:0x819eead)     <<<--- here
  mov [47] 0x748(reg5) -> reg4 type='unsigned long long' size=0x8 (die:0x8180edd)
  mov [4e] 0x750(reg5) -> reg0 type='unsigned long long' size=0x8 (die:0x8180edd)
  mov [58] reg4 -> -0xc0(stack) type='unsigned long long' size=0x8 (die:0x8180edd)
  chk [63] reg5 offset=0x748 ok=1 kind=1 (struct tcp_sock*) : Good!           <<<--- new type
  found by insn track: 0x748(reg5) type-offset=0x748
  final result:  type='struct tcp_sock' size=0xa98 (die:0x819eeb2)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821232628.353177-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-22 12:38:18 -03:00
Namhyung Kim
037f1b67e8 perf annotate: Cache debuginfo for data type profiling
In find_data_type(), it creates and deletes a debug info whenver it
tries to find data type for a sample.  This is inefficient and it most
likely accesses the same binary again and again.

Let's add a single entry cache the debug info structure for the last DSO.
Depending on sample data, it usually gives me 2~3x (and sometimes more)
speed ups.

Note that this will introduce a little difference in the output due to
the order of checking stack operations.  It used to check the stack ops
before checking the availability of debug info but I moved it after the
symbol check.  So it'll report stack operations in DSOs without debug
info as unknown.  But I think it's ok and better to have the checking
near the caching logic.

Committer testing:

  root@x1:~# perf mem record -a sleep 5s
  root@x1:~# perf evlist
  cpu_atom/mem-loads,ldlat=30/P
  cpu_atom/mem-stores/P
  dummy:u
  root@x1:~# diff -u before after
  --- before	2024-08-08 09:33:53.880780784 -0300
  +++ after	2024-08-08 09:35:13.917325041 -0300
  @@ -81,8 +81,8 @@
   # Overhead  Data Type
   # ........  .........
   #
  -    55.43%  (unknown)
  -    11.61%  (stack operation)
  +    55.56%  (unknown)
  +    11.48%  (stack operation)
        4.93%  struct pcpu_hot
        3.26%  unsigned int
        2.48%  struct

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240805234648.1453689-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-08 09:34:43 -03:00
Athira Rajeev
b1d8d968a7 perf annotate: Update TYPE_STATE_MAX_REGS to include max of regs in powerpc
TYPE_STATE_MAX_REGS is arch-dependent. Currently this is defined to be
16.

While checking if reg is valid using has_reg_type, max value is checked
using TYPE_STATE_MAX_REGS value.

Define this conditionally for powerpc.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-4-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
782959ac24 perf annotate: Add "update_insn_state" callback function to handle arch specific instruction tracking
Add "update_insn_state" callback to "struct arch" to handle instruction
tracking. Currently updating instruction state is handled by static
function "update_insn_state_x86" which is defined in "annotate-data.c".

Make this as a callback for specific arch and move to archs specific
file "arch/x86/annotate/instructions.c" . This will help to add helper
function for other platforms in file:
"arch/<platform>/annotate/instructions.c" and make changes/updates
easier.

Define callback "update_insn_state" as part of "struct arch", also make
some of the debug functions non-static so that it can be referenced from
other places.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
1d303deedb perf annotate: Move the data structures related to register type to header file
Data type profiling uses instruction tracking by checking each
instruction and updating the register type state in some data
structures.

This is useful to find the data type in cases when the register state
gets transferred from one reg to another.

Example, in x86, "mov" instruction and in powerpc, "mr" instruction.

Currently these structures are defined in annotate-data.c and
instruction tracking is implemented only for x86.

Move these data structures to "annotate-data.h" header file so that
other arch implementations can use it in arch specific files as well.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Namhyung Kim
d001c7a7f4 perf annotate-data: Add hist_entry__annotate_data_tui()
Support data type profiling output on TUI.

Testing from Arnaldo:

First make sure that the debug information for your workload binaries
in embedded in them by building it with '-g' or install the debuginfo
packages, since our workload is 'find':

  root@number:~# type find
  find is hashed (/usr/bin/find)
  root@number:~# rpm -qf /usr/bin/find
  findutils-4.9.0-5.fc39.x86_64
  root@number:~# dnf debuginfo-install findutils
  <SNIP>
  root@number:~#

Then collect some data:

  root@number:~# echo 1 > /proc/sys/vm/drop_caches
  root@number:~# perf mem record find / > /dev/null
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.331 MB perf.data (3982 samples) ]
  root@number:~#

Finally do data-type annotation with the following command, that will
default, as 'perf report' to the --tui mode, with lines colored to
highlight the hotspots, etc.

  root@number:~# perf annotate --data-type
  Annotate type: 'struct predicate' (58 samples)
      Percent     Offset       Size  Field
       100.00          0        312  struct predicate {
         0.00          0          8      PRED_FUNC        pred_func;
         0.00          8          8      char*    p_name;
         0.00         16          4      enum predicate_type      p_type;
         0.00         20          4      enum predicate_precedence        p_prec;
         0.00         24          1      _Bool    side_effects;
         0.00         25          1      _Bool    no_default_print;
         0.00         26          1      _Bool    need_stat;
         0.00         27          1      _Bool    need_type;
         0.00         28          1      _Bool    need_inum;
         0.00         32          4      enum EvaluationCost      p_cost;
         0.00         36          4      float    est_success_rate;
         0.00         40          1      _Bool    literal_control_chars;
         0.00         41          1      _Bool    artificial;
         0.00         48          8      char*    arg_text;
  <SNIP>

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240411033256.2099646-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12 12:02:06 -03:00
Namhyung Kim
9b561be15f perf annotate-data: Add hist_entry__annotate_data_tty()
And move the related code into util/annotate-data.c file.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240411033256.2099646-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12 12:02:06 -03:00
Namhyung Kim
55ee3d005d perf annotate-data: Add a cache for global variable types
They are often searched by many different places.  Let's add a cache
for them to reduce the duplicate DWARF access.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240319055115.4063940-23-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:29 -03:00
Namhyung Kim
b3c95109c1 perf annotate-data: Add stack canary type
When the stack protector is enabled, compiler would generate code to
check stack overflow with a special value called 'stack carary' at
runtime.  On x86_64, GCC hard-codes the stack canary as %gs:40.

While there's a definition of fixed_percpu_data in asm/processor.h,
it seems that the header is not included everywhere and many places
it cannot find the type info.  As it's in the well-known location (at
%gs:40), let's add a pseudo stack canary type to handle it specially.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240319055115.4063940-22-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:29 -03:00
Namhyung Kim
eb8a55e01d perf annotate-data: Implement instruction tracking
If it failed to find a variable for the location directly, it might be
due to a missing variable in the source code.  For example, accessing
pointer variables in a chain can result in the case like below:

  struct foo *foo = ...;

  int i = foo->bar->baz;

The DWARF debug information is created for each variable so it'd have
one for 'foo'.  But there's no variable for 'foo->bar' and then it
cannot know the type of 'bar' and 'baz'.

The above source code can be compiled to the follow x86 instructions:

  mov  0x8(%rax), %rcx
  mov  0x4(%rcx), %rdx   <=== PMU sample
  mov  %rdx, -4(%rbp)

Let's say 'foo' is located in the %rax and it has a pointer to struct
foo.  But perf sample is captured in the second instruction and there
is no variable or type info for the %rcx.

It'd be great if compiler could generate debug info for %rcx, but we
should handle it on our side.  So this patch implements the logic to
iterate instructions and update the type table for each location.

As it already collected a list of scopes including the target
instruction, we can use it to construct the type table smartly.

  +----------------  scope[0] subprogram
  |
  | +--------------  scope[1] lexical_block
  | |
  | | +------------  scope[2] inlined_subroutine
  | | |
  | | | +----------  scope[3] inlined_subroutine
  | | | |
  | | | | +--------  scope[4] lexical_block
  | | | | |
  | | | | |     ***  target instruction
  ...

Image the target instruction has 5 scopes, each scope will have its own
variables and parameters.  Then it can start with the innermost scope
(4).  So it'd search the shortest path from the start of scope[4] to
the target address and build a list of basic blocks.  Then it iterates
the basic blocks with the variables in the scope and update the table.
If it finds a type at the target instruction, then returns it.

Otherwise, it moves to the upper scope[3].  Now it'd search the shortest
path from the start of scope[3] to the start of scope[4].  Then connect
it to the existing basic block list.  Then it'd iterate the blocks with
variables for both scopes.  It can repeat this until it finds a type at
the target instruction or reaches to the top scope[0].

As the basic blocks contain the shortest path, it won't worry about
branches and can update the table simply.

The final check will be done by find_matching_type() in the next patch.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240319055115.4063940-15-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:29 -03:00
Namhyung Kim
1ebb5e17ef perf annotate-data: Add get_global_var_type()
Accessing global variable is common when it tracks execution later.
Factor out the common code into a function for later use.

It adds thread and cpumode to struct data_loc_info to find (global)
symbols if needed.  Also remove var_name as it's retrieved in the
helper function.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240319055115.4063940-12-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:28 -03:00
Namhyung Kim
4f903455be perf annotate-data: Add update_insn_state()
The update_insn_state() function is to update the type state table after
processing each instruction.  For now, it handles MOV (on x86) insn
to transfer type info from the source location to the target.

The location can be a register or a stack slot.  Check carefully when
memory reference happens and fetch the type correctly.  It basically
ignores write to a memory since it doesn't change the type info.  One
exception is writes to (new) stack slots for register spilling.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240319055115.4063940-11-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:28 -03:00
Namhyung Kim
a3f4d5b57d perf annotate-data: Introduce 'struct data_loc_info'
The find_data_type() needs many information to describe the location of
the data.  Add the new 'struct data_loc_info' to pass those information at
once.

No functional changes intended.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240319055115.4063940-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:28 -03:00
Namhyung Kim
5f7cdde843 perf annotate-data: Support global variables
Global variables are accessed using PC-relative address so it needs to
be handled separately.  The PC-rel addressing is detected by using
DWARF_REG_PC.  On x86, %rip register would be used.

The address can be calculated using the ip and offset in the
instruction.  But it should start from the next instruction so add
calculate_pcrel_addr() to do it properly.

But global variables defined in a different file would only have a
declaration which doesn't include a location list.  So it first tries
to get the type info using the address, and then looks up the variable
declarations using name.  The name of global variables should be get
from the symbol table.  The declaration would have the type info.

So extend find_var_type() to take both address and name for global
variables.

The stat is now looks like:

  Annotate data type stats:
  total 294, ok 153 (52.0%), bad 141 (48.0%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          61 : no_var
          10 : no_typeinfo
           8 : bad_offset

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-22 12:08:20 -08:00
Namhyung Kim
7a54f1d83d perf annotate-data: Add stack operation pseudo type
A typical function prologue and epilogue include multiple stack
operations to save and restore the current value of registers.
On x86, it looks like below:

  push  r15
  push  r14
  push  r13
  push  r12

  ...

  pop   r12
  pop   r13
  pop   r14
  pop   r15
  ret

As these all touches the stack memory region, chances are high that they
appear in a memory profile data.  But these are not used for any real
purpose yet so it'd return no types.

One of my profile type shows that non neglible portion of data came from
the stack operations.  It also seems GCC generates more stack operations
than clang.

Annotate Instruction stats
total 264, ok 169 (64.0%), bad 95 (36.0%)

    Name      :  Good   Bad
  -----------------------------------------------------------
    movq      :    49    27
    movl      :    24     9
    popq      :     0    19   <-- here
    cmpl      :    17     2
    addq      :    14     1
    cmpq      :    12     2
    cmpxchgl  :     3     7

Instead of dealing them as unknown, let's create a seperate pseudo type
to represent those stack operations separately.

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-5-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-22 12:08:20 -08:00
Namhyung Kim
d3030191d3 perf annotate-data: Handle array style accesses
On x86, instructions for array access often looks like below.

  mov  0x1234(%rax,%rbx,8), %rcx

Usually the first register holds the type information and the second one
has the index.  And the current code only looks up a variable for the
first register.  But it's possible to be in the other way around so it
needs to check the second register if the first one failed.

The stat changed like this.

  Annotate data type stats:
  total 294, ok 148 (50.3%), bad 146 (49.7%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          66 : no_var
          10 : no_typeinfo
           8 : bad_offset

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-22 12:08:19 -08:00
Namhyung Kim
61a9741e9f perf annotate: Add --type-stat option for debugging
The --type-stat option is to be used with --data-type and to print
detailed failure reasons for the data type annotation.

  $ perf annotate --data-type --type-stat
  Annotate data type stats:
  total 294, ok 116 (39.5%), bad 178 (60.5%)
  -----------------------------------------------------------
          30 : no_sym
          40 : no_insn_ops
          33 : no_mem_ops
          63 : no_var
           4 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-17-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-23 22:40:13 -03:00
Namhyung Kim
9bd7ddd157 perf annotate-data: Update sample histogram for type
The annotated_data_type__update_samples() to get histogram for data type
access.

It'll be called by perf annotate to show which fields in the data type
are accessed frequently.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-12-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-23 22:39:42 -03:00
Namhyung Kim
4a111cadac perf annotate-data: Add member field in the data type
Add child member field if the current type is a composite type like a
struct or union.  The member fields are linked in the children list and
do the same recursively if the child itself is a composite type.

Add 'self' member to the annotated_data_type to handle the members in
the same way.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-11-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-23 22:39:42 -03:00
Namhyung Kim
2f2c41bdd8 perf report: Add 'type' sort key
The 'type' sort key is to aggregate hist entries by data type they
access.  Add mem_type field to hist_entry struct to save the type.  If
hist_entry__get_data_type() returns NULL, it'd use the 'unknown_type'
instance.

Committer testing:

Before:

  # perf mem record  sleep 2s
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.037 MB perf.data (4 samples) ]
  root@number:/home/acme/Downloads# perf report --stdio -s type
  Error:
  Unknown --sort key: `type'
   Usage: perf report [<options>]

      -s, --sort <key[,key2...]>
                            sort by key(s): overhead overhead_sys overhead_us overhead_guest_sys
                            overhead_guest_us overhead_children sample period
                            pid comm dso symbol parent cpu socket srcline srcfile
                            local_weight weight transaction trace symbol_size
                            dso_size cgroup cgroup_id ipc_null time code_page_size
                            local_ins_lat ins_lat local_p_stage_cyc p_stage_cyc
                            addr local_retire_lat retire_lat simd dso_from dso_to
                            symbol_from symbol_to mispredict abort in_tx cycles
                            srcline_from srcline_to ipc_lbr addr_from addr_to
                            symbol_daddr dso_daddr locked tlb mem snoop dcacheline
                            symbol_iaddr phys_daddr data_page_size blocked
  #

After:

  # perf report --stdio -s type
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 4  of event 'cpu_atom/mem-loads,ldlat=30/P'
  # Event count (approx.): 7
  #
  # Overhead  Data Type
  # ........  .........
  #
     100.00%  (unknown)

  #
  # (Tip: Print event counts in CSV format with: perf stat -x,)
  #
  # rpm -q kernel-debuginfo
  kernel-debuginfo-6.6.4-200.fc39.x86_64
  # uname -r
  6.6.4-200.fc39.x86_64
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org>
Cc: linux-trace-devel@vger.kernel.org>
Link: https://lore.kernel.org/r/20231213001323.718046-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-23 22:39:42 -03:00
Namhyung Kim
fc044c53b9 perf annotate-data: Add dso->data_types tree
To aggregate accesses to the same data type, add 'data_types' tree in
DSO to maintain data types and find it by name and size.

It might have different data types that happen to have the same name,
so it also compares the size of the type.

Even if it doesn't 100% guarantee, it reduces the possibility of
mis-handling of such conflicts.

And I don't think it's common to have different types with the same
name.

Committer notes:

Very few cases on the Linux kernel, but there are some different types
with the same name, unsure if there is a debug mode in libbpf dedup that
warns about such cases, but there are provisions in pahole for that,
see:

  "emit: Notice type shadowing, i.e. multiple types with the same name (enum, struct, union, etc)"
    https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=4f332dbfd02072e4f410db7bdcda8d6e3422974b

  $ pahole --compile > vmlinux.h
  $ rm -f a ; make a
  cc     a.c   -o a
  $ grep __[0-9] vmlinux.h
  union irte__1 {
  struct map_info__1;
  struct map_info__1 {
  	struct map_info__1 *       next;                 /*     0     8 */
  $

  drivers/iommu/amd/amd_iommu_types.h 'union irte'
  include/linux/dmar.h                'struct irte'

  include/linux/device-mapper.h:

    union map_info {
            void *ptr;
    };

  include/linux/mtd/map.h:

    struct map_info {
        const char *name;
        unsigned long size;
        resource_size_t phys;
   <SNIP>

  kernel/events/uprobes.c:

   struct map_info {
        struct map_info *next;
        struct mm_struct *mm;
        unsigned long vaddr;
  };

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-23 22:39:42 -03:00
Namhyung Kim
b9c87f536c perf annotate-data: Add find_data_type() to get type from memory access
The find_data_type() is to get a data type from the memory access at the
given address (IP) using a register and an offset.

It requires DWARF debug info in the DSO and searches the list of
variables and function parameters in the scope.

In a pseudo code, it does basically the following:

  find_data_type(dso, ip, reg, offset)
  {
      pc = map__rip_2objdump(ip);
      CU = dwarf_addrdie(dso->dwarf, pc);
      scopes = die_get_scopes(CU, pc);
      for_each_scope(S, scopes) {
          V = die_find_variable_by_reg(S, pc, reg);
          if (V && V.type == pointer_type) {
              T = die_get_real_type(V);
              if (offset < T.size)
                  return T;
          }
      }
      return NULL;
  }

Committer notes:

The 'size' variable in check_variable() is 64-bit, so use PRIu64 and
inttypes.h to debug it.

Ditto at find_data_type_die().

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: linux-toolchains@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20231213001323.718046-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-23 22:39:18 -03:00