linux/arch/riscv/kernel/mcount-dyn.S

192 lines
4.3 KiB
ArmAsm
Raw Normal View History

/* SPDX-License-Identifier: GPL-2.0 */
/* Copyright (C) 2017 Andes Technology Corporation */
#include <linux/init.h>
#include <linux/linkage.h>
#include <linux/export.h>
#include <asm/asm.h>
#include <asm/csr.h>
#include <asm/unistd.h>
#include <asm/thread_info.h>
#include <asm/asm-offsets.h>
#include <asm/ftrace.h>
.text
riscv: ftrace: Reduce the detour code size to half Use a temporary register to reduce the size of detour code from 16 bytes to 8 bytes. The previous implementation is from 'commit afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")'. Before the patch: <func_prolog>: 0: REG_S ra, -SZREG(sp) 4: auipc ra, ? 8: jalr ?(ra) 12: REG_L ra, -SZREG(sp) (func_boddy) After the patch: <func_prolog>: 0: auipc t0, ? 4: jalr t0, ?(t0) (func_boddy) This patch not just reduces the size of detour code, but also fixes an important issue: An Ftrace callback registered with FTRACE_OPS_FL_IPMODIFY flag can actually change the instruction pointer, e.g. to "replace" the given kernel function with a new one, which is needed for livepatching, etc. In this case, the trampoline (ftrace_regs_caller) would not return to <func_prolog+12> but would rather jump to the new function. So, "REG_L ra, -SZREG(sp)" would not run and the original return address would not be restored. The kernel is likely to hang or crash as a result. This can be easily demonstrated if one tries to "replace", say, cmdline_proc_show() with a new function with the same signature using instruction_pointer_set(&fregs->regs, new_func_addr) in the Ftrace callback. Link: https://lore.kernel.org/linux-riscv/20221122075440.1165172-1-suagrfillet@gmail.com/ Link: https://lore.kernel.org/linux-riscv/d7d5730b-ebef-68e5-5046-e763e1ee6164@yadro.com/ Co-developed-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Evgenii Shatokhin <e.shatokhin@yadro.com> Reviewed-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/r/20230112090603.1295340-4-guoren@kernel.org Cc: stable@vger.kernel.org Fixes: 10626c32e382 ("riscv/ftrace: Add basic support") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-12 04:05:59 -05:00
#define FENTRY_RA_OFFSET 8
#define ABI_SIZE_ON_STACK 80
#define ABI_A0 0
#define ABI_A1 8
#define ABI_A2 16
#define ABI_A3 24
#define ABI_A4 32
#define ABI_A5 40
#define ABI_A6 48
#define ABI_A7 56
riscv: ftrace: Reduce the detour code size to half Use a temporary register to reduce the size of detour code from 16 bytes to 8 bytes. The previous implementation is from 'commit afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")'. Before the patch: <func_prolog>: 0: REG_S ra, -SZREG(sp) 4: auipc ra, ? 8: jalr ?(ra) 12: REG_L ra, -SZREG(sp) (func_boddy) After the patch: <func_prolog>: 0: auipc t0, ? 4: jalr t0, ?(t0) (func_boddy) This patch not just reduces the size of detour code, but also fixes an important issue: An Ftrace callback registered with FTRACE_OPS_FL_IPMODIFY flag can actually change the instruction pointer, e.g. to "replace" the given kernel function with a new one, which is needed for livepatching, etc. In this case, the trampoline (ftrace_regs_caller) would not return to <func_prolog+12> but would rather jump to the new function. So, "REG_L ra, -SZREG(sp)" would not run and the original return address would not be restored. The kernel is likely to hang or crash as a result. This can be easily demonstrated if one tries to "replace", say, cmdline_proc_show() with a new function with the same signature using instruction_pointer_set(&fregs->regs, new_func_addr) in the Ftrace callback. Link: https://lore.kernel.org/linux-riscv/20221122075440.1165172-1-suagrfillet@gmail.com/ Link: https://lore.kernel.org/linux-riscv/d7d5730b-ebef-68e5-5046-e763e1ee6164@yadro.com/ Co-developed-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Evgenii Shatokhin <e.shatokhin@yadro.com> Reviewed-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/r/20230112090603.1295340-4-guoren@kernel.org Cc: stable@vger.kernel.org Fixes: 10626c32e382 ("riscv/ftrace: Add basic support") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-12 04:05:59 -05:00
#define ABI_T0 64
#define ABI_RA 72
.macro SAVE_ABI
addi sp, sp, -ABI_SIZE_ON_STACK
REG_S a0, ABI_A0(sp)
REG_S a1, ABI_A1(sp)
REG_S a2, ABI_A2(sp)
REG_S a3, ABI_A3(sp)
REG_S a4, ABI_A4(sp)
REG_S a5, ABI_A5(sp)
REG_S a6, ABI_A6(sp)
REG_S a7, ABI_A7(sp)
riscv: ftrace: Reduce the detour code size to half Use a temporary register to reduce the size of detour code from 16 bytes to 8 bytes. The previous implementation is from 'commit afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")'. Before the patch: <func_prolog>: 0: REG_S ra, -SZREG(sp) 4: auipc ra, ? 8: jalr ?(ra) 12: REG_L ra, -SZREG(sp) (func_boddy) After the patch: <func_prolog>: 0: auipc t0, ? 4: jalr t0, ?(t0) (func_boddy) This patch not just reduces the size of detour code, but also fixes an important issue: An Ftrace callback registered with FTRACE_OPS_FL_IPMODIFY flag can actually change the instruction pointer, e.g. to "replace" the given kernel function with a new one, which is needed for livepatching, etc. In this case, the trampoline (ftrace_regs_caller) would not return to <func_prolog+12> but would rather jump to the new function. So, "REG_L ra, -SZREG(sp)" would not run and the original return address would not be restored. The kernel is likely to hang or crash as a result. This can be easily demonstrated if one tries to "replace", say, cmdline_proc_show() with a new function with the same signature using instruction_pointer_set(&fregs->regs, new_func_addr) in the Ftrace callback. Link: https://lore.kernel.org/linux-riscv/20221122075440.1165172-1-suagrfillet@gmail.com/ Link: https://lore.kernel.org/linux-riscv/d7d5730b-ebef-68e5-5046-e763e1ee6164@yadro.com/ Co-developed-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Evgenii Shatokhin <e.shatokhin@yadro.com> Reviewed-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/r/20230112090603.1295340-4-guoren@kernel.org Cc: stable@vger.kernel.org Fixes: 10626c32e382 ("riscv/ftrace: Add basic support") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-12 04:05:59 -05:00
REG_S t0, ABI_T0(sp)
REG_S ra, ABI_RA(sp)
.endm
.macro RESTORE_ABI
REG_L a0, ABI_A0(sp)
REG_L a1, ABI_A1(sp)
REG_L a2, ABI_A2(sp)
REG_L a3, ABI_A3(sp)
REG_L a4, ABI_A4(sp)
REG_L a5, ABI_A5(sp)
REG_L a6, ABI_A6(sp)
REG_L a7, ABI_A7(sp)
riscv: ftrace: Reduce the detour code size to half Use a temporary register to reduce the size of detour code from 16 bytes to 8 bytes. The previous implementation is from 'commit afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")'. Before the patch: <func_prolog>: 0: REG_S ra, -SZREG(sp) 4: auipc ra, ? 8: jalr ?(ra) 12: REG_L ra, -SZREG(sp) (func_boddy) After the patch: <func_prolog>: 0: auipc t0, ? 4: jalr t0, ?(t0) (func_boddy) This patch not just reduces the size of detour code, but also fixes an important issue: An Ftrace callback registered with FTRACE_OPS_FL_IPMODIFY flag can actually change the instruction pointer, e.g. to "replace" the given kernel function with a new one, which is needed for livepatching, etc. In this case, the trampoline (ftrace_regs_caller) would not return to <func_prolog+12> but would rather jump to the new function. So, "REG_L ra, -SZREG(sp)" would not run and the original return address would not be restored. The kernel is likely to hang or crash as a result. This can be easily demonstrated if one tries to "replace", say, cmdline_proc_show() with a new function with the same signature using instruction_pointer_set(&fregs->regs, new_func_addr) in the Ftrace callback. Link: https://lore.kernel.org/linux-riscv/20221122075440.1165172-1-suagrfillet@gmail.com/ Link: https://lore.kernel.org/linux-riscv/d7d5730b-ebef-68e5-5046-e763e1ee6164@yadro.com/ Co-developed-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Evgenii Shatokhin <e.shatokhin@yadro.com> Reviewed-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/r/20230112090603.1295340-4-guoren@kernel.org Cc: stable@vger.kernel.org Fixes: 10626c32e382 ("riscv/ftrace: Add basic support") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-12 04:05:59 -05:00
REG_L t0, ABI_T0(sp)
REG_L ra, ABI_RA(sp)
addi sp, sp, ABI_SIZE_ON_STACK
.endm
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
/**
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
* SAVE_ABI_REGS - save regs against the ftrace_regs struct
*
* After the stack is established,
*
* 0(sp) stores the PC of the traced function which can be accessed
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
* by &(fregs)->epc in tracing function. Note that the real
* function entry address should be computed with -FENTRY_RA_OFFSET.
*
* 8(sp) stores the function return address (i.e. parent IP) that
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
* can be accessed by &(fregs)->ra in tracing function.
*
* The other regs are saved at the respective localtion and accessed
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
* by the respective ftrace_regs member.
*
* Here is the layout of stack for your reference.
*
* PT_SIZE_ON_STACK -> +++++++++
* + ..... +
* + a0-a7 + --++++-> ftrace_caller saved
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
* + t1 + --++++-> direct tramp address
* + s0 + --+ // frame pointer
* + sp + +
* + ra + --+ // parent IP
* sp -> + epc + --+ // PC
* +++++++++
**/
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
.macro SAVE_ABI_REGS
mv t4, sp // Save original SP in T4
addi sp, sp, -FREGS_SIZE_ON_STACK
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
REG_S t0, FREGS_EPC(sp)
REG_S x1, FREGS_RA(sp)
REG_S t4, FREGS_SP(sp) // Put original SP on stack
#ifdef HAVE_FUNCTION_GRAPH_FP_TEST
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
REG_S x8, FREGS_S0(sp)
#endif
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
REG_S x6, FREGS_T1(sp)
// save the arguments
REG_S x10, FREGS_A0(sp)
REG_S x11, FREGS_A1(sp)
REG_S x12, FREGS_A2(sp)
REG_S x13, FREGS_A3(sp)
REG_S x14, FREGS_A4(sp)
REG_S x15, FREGS_A5(sp)
REG_S x16, FREGS_A6(sp)
REG_S x17, FREGS_A7(sp)
.endm
.macro RESTORE_ABI_REGS, all=0
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
REG_L t0, FREGS_EPC(sp)
REG_L x1, FREGS_RA(sp)
#ifdef HAVE_FUNCTION_GRAPH_FP_TEST
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
REG_L x8, FREGS_S0(sp)
#endif
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
REG_L x6, FREGS_T1(sp)
// restore the arguments
REG_L x10, FREGS_A0(sp)
REG_L x11, FREGS_A1(sp)
REG_L x12, FREGS_A2(sp)
REG_L x13, FREGS_A3(sp)
REG_L x14, FREGS_A4(sp)
REG_L x15, FREGS_A5(sp)
REG_L x16, FREGS_A6(sp)
REG_L x17, FREGS_A7(sp)
addi sp, sp, FREGS_SIZE_ON_STACK
.endm
.macro PREPARE_ARGS
addi a0, t0, -FENTRY_RA_OFFSET
la a1, function_trace_op
REG_L a2, 0(a1)
mv a1, ra
mv a3, sp
.endm
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
#endif /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
#ifndef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
SYM_FUNC_START(ftrace_caller)
SAVE_ABI
riscv: ftrace: Reduce the detour code size to half Use a temporary register to reduce the size of detour code from 16 bytes to 8 bytes. The previous implementation is from 'commit afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")'. Before the patch: <func_prolog>: 0: REG_S ra, -SZREG(sp) 4: auipc ra, ? 8: jalr ?(ra) 12: REG_L ra, -SZREG(sp) (func_boddy) After the patch: <func_prolog>: 0: auipc t0, ? 4: jalr t0, ?(t0) (func_boddy) This patch not just reduces the size of detour code, but also fixes an important issue: An Ftrace callback registered with FTRACE_OPS_FL_IPMODIFY flag can actually change the instruction pointer, e.g. to "replace" the given kernel function with a new one, which is needed for livepatching, etc. In this case, the trampoline (ftrace_regs_caller) would not return to <func_prolog+12> but would rather jump to the new function. So, "REG_L ra, -SZREG(sp)" would not run and the original return address would not be restored. The kernel is likely to hang or crash as a result. This can be easily demonstrated if one tries to "replace", say, cmdline_proc_show() with a new function with the same signature using instruction_pointer_set(&fregs->regs, new_func_addr) in the Ftrace callback. Link: https://lore.kernel.org/linux-riscv/20221122075440.1165172-1-suagrfillet@gmail.com/ Link: https://lore.kernel.org/linux-riscv/d7d5730b-ebef-68e5-5046-e763e1ee6164@yadro.com/ Co-developed-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Evgenii Shatokhin <e.shatokhin@yadro.com> Reviewed-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/r/20230112090603.1295340-4-guoren@kernel.org Cc: stable@vger.kernel.org Fixes: 10626c32e382 ("riscv/ftrace: Add basic support") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-12 04:05:59 -05:00
addi a0, t0, -FENTRY_RA_OFFSET
la a1, function_trace_op
REG_L a2, 0(a1)
riscv: ftrace: Reduce the detour code size to half Use a temporary register to reduce the size of detour code from 16 bytes to 8 bytes. The previous implementation is from 'commit afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")'. Before the patch: <func_prolog>: 0: REG_S ra, -SZREG(sp) 4: auipc ra, ? 8: jalr ?(ra) 12: REG_L ra, -SZREG(sp) (func_boddy) After the patch: <func_prolog>: 0: auipc t0, ? 4: jalr t0, ?(t0) (func_boddy) This patch not just reduces the size of detour code, but also fixes an important issue: An Ftrace callback registered with FTRACE_OPS_FL_IPMODIFY flag can actually change the instruction pointer, e.g. to "replace" the given kernel function with a new one, which is needed for livepatching, etc. In this case, the trampoline (ftrace_regs_caller) would not return to <func_prolog+12> but would rather jump to the new function. So, "REG_L ra, -SZREG(sp)" would not run and the original return address would not be restored. The kernel is likely to hang or crash as a result. This can be easily demonstrated if one tries to "replace", say, cmdline_proc_show() with a new function with the same signature using instruction_pointer_set(&fregs->regs, new_func_addr) in the Ftrace callback. Link: https://lore.kernel.org/linux-riscv/20221122075440.1165172-1-suagrfillet@gmail.com/ Link: https://lore.kernel.org/linux-riscv/d7d5730b-ebef-68e5-5046-e763e1ee6164@yadro.com/ Co-developed-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Evgenii Shatokhin <e.shatokhin@yadro.com> Reviewed-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/r/20230112090603.1295340-4-guoren@kernel.org Cc: stable@vger.kernel.org Fixes: 10626c32e382 ("riscv/ftrace: Add basic support") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-12 04:05:59 -05:00
mv a1, ra
mv a3, sp
SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
call ftrace_stub
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
riscv: ftrace: Reduce the detour code size to half Use a temporary register to reduce the size of detour code from 16 bytes to 8 bytes. The previous implementation is from 'commit afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")'. Before the patch: <func_prolog>: 0: REG_S ra, -SZREG(sp) 4: auipc ra, ? 8: jalr ?(ra) 12: REG_L ra, -SZREG(sp) (func_boddy) After the patch: <func_prolog>: 0: auipc t0, ? 4: jalr t0, ?(t0) (func_boddy) This patch not just reduces the size of detour code, but also fixes an important issue: An Ftrace callback registered with FTRACE_OPS_FL_IPMODIFY flag can actually change the instruction pointer, e.g. to "replace" the given kernel function with a new one, which is needed for livepatching, etc. In this case, the trampoline (ftrace_regs_caller) would not return to <func_prolog+12> but would rather jump to the new function. So, "REG_L ra, -SZREG(sp)" would not run and the original return address would not be restored. The kernel is likely to hang or crash as a result. This can be easily demonstrated if one tries to "replace", say, cmdline_proc_show() with a new function with the same signature using instruction_pointer_set(&fregs->regs, new_func_addr) in the Ftrace callback. Link: https://lore.kernel.org/linux-riscv/20221122075440.1165172-1-suagrfillet@gmail.com/ Link: https://lore.kernel.org/linux-riscv/d7d5730b-ebef-68e5-5046-e763e1ee6164@yadro.com/ Co-developed-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Song Shuai <suagrfillet@gmail.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Evgenii Shatokhin <e.shatokhin@yadro.com> Reviewed-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/r/20230112090603.1295340-4-guoren@kernel.org Cc: stable@vger.kernel.org Fixes: 10626c32e382 ("riscv/ftrace: Add basic support") Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-01-12 04:05:59 -05:00
addi a0, sp, ABI_RA
REG_L a1, ABI_T0(sp)
addi a1, a1, -FENTRY_RA_OFFSET
#ifdef HAVE_FUNCTION_GRAPH_FP_TEST
mv a2, s0
#endif
SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL)
call ftrace_stub
#endif
RESTORE_ABI
jr t0
SYM_FUNC_END(ftrace_caller)
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
#else /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */
SYM_FUNC_START(ftrace_caller)
mv t1, zero
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
SAVE_ABI_REGS
PREPARE_ARGS
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
call ftrace_stub
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
RESTORE_ABI_REGS
bnez t1, .Ldirect
jr t0
.Ldirect:
jr t1
SYM_FUNC_END(ftrace_caller)
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
#endif /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
SYM_CODE_START(ftrace_stub_direct_tramp)
jr t0
SYM_CODE_END(ftrace_stub_direct_tramp)
#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */