linux/arch/riscv/kernel/asm-offsets.c

515 lines
21 KiB
C
Raw Permalink Normal View History

// SPDX-License-Identifier: GPL-2.0-only
/*
* Copyright (C) 2012 Regents of the University of California
* Copyright (C) 2017 SiFive
*/
#include <linux/kbuild.h>
#include <linux/mm.h>
#include <linux/sched.h>
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
#include <linux/ftrace.h>
RISC-V: Add arch functions to support hibernation/suspend-to-disk Low level Arch functions were created to support hibernation. swsusp_arch_suspend() relies code from __cpu_suspend_enter() to write cpu state onto the stack, then calling swsusp_save() to save the memory image. Arch specific hibernation header is implemented and is utilized by the arch_hibernation_header_restore() and arch_hibernation_header_save() functions. The arch specific hibernation header consists of satp, hartid, and the cpu_resume address. The kernel built version is also need to be saved into the hibernation image header to making sure only the same kernel is restore when resume. swsusp_arch_resume() creates a temporary page table that covering only the linear map. It copies the restore code to a 'safe' page, then start to restore the memory image. Once completed, it restores the original kernel's page table. It then calls into __hibernate_cpu_resume() to restore the CPU context. Finally, it follows the normal hibernation path back to the hibernation core. To enable hibernation/suspend to disk into RISCV, the below config need to be enabled: - CONFIG_HIBERNATION - CONFIG_ARCH_HIBERNATION_HEADER - CONFIG_ARCH_HIBERNATION_POSSIBLE Signed-off-by: Sia Jee Heng <jeeheng.sia@starfivetech.com> Reviewed-by: Ley Foon Tan <leyfoon.tan@starfivetech.com> Reviewed-by: Mason Huo <mason.huo@starfivetech.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230330064321.1008373-5-jeeheng.sia@starfivetech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-30 14:43:21 +08:00
#include <linux/suspend.h>
#include <asm/kvm_host.h>
#include <asm/thread_info.h>
#include <asm/ptrace.h>
#include <asm/cpu_ops_sbi.h>
#include <asm/stacktrace.h>
#include <asm/suspend.h>
void asm_offsets(void);
void asm_offsets(void)
{
OFFSET(TASK_THREAD_RA, task_struct, thread.ra);
OFFSET(TASK_THREAD_SP, task_struct, thread.sp);
OFFSET(TASK_THREAD_S0, task_struct, thread.s[0]);
OFFSET(TASK_THREAD_S1, task_struct, thread.s[1]);
OFFSET(TASK_THREAD_S2, task_struct, thread.s[2]);
OFFSET(TASK_THREAD_S3, task_struct, thread.s[3]);
OFFSET(TASK_THREAD_S4, task_struct, thread.s[4]);
OFFSET(TASK_THREAD_S5, task_struct, thread.s[5]);
OFFSET(TASK_THREAD_S6, task_struct, thread.s[6]);
OFFSET(TASK_THREAD_S7, task_struct, thread.s[7]);
OFFSET(TASK_THREAD_S8, task_struct, thread.s[8]);
OFFSET(TASK_THREAD_S9, task_struct, thread.s[9]);
OFFSET(TASK_THREAD_S10, task_struct, thread.s[10]);
OFFSET(TASK_THREAD_S11, task_struct, thread.s[11]);
riscv: Stop emitting preventive sfence.vma for new vmalloc mappings In 6.5, we removed the vmalloc fault path because that can't work (see [1] [2]). Then in order to make sure that new page table entries were seen by the page table walker, we had to preventively emit a sfence.vma on all harts [3] but this solution is very costly since it relies on IPI. And even there, we could end up in a loop of vmalloc faults if a vmalloc allocation is done in the IPI path (for example if it is traced, see [4]), which could result in a kernel stack overflow. Those preventive sfence.vma needed to be emitted because: - if the uarch caches invalid entries, the new mapping may not be observed by the page table walker and an invalidation may be needed. - if the uarch does not cache invalid entries, a reordered access could "miss" the new mapping and traps: in that case, we would actually only need to retry the access, no sfence.vma is required. So this patch removes those preventive sfence.vma and actually handles the possible (and unlikely) exceptions. And since the kernel stacks mappings lie in the vmalloc area, this handling must be done very early when the trap is taken, at the very beginning of handle_exception: this also rules out the vmalloc allocations in the fault path. Link: https://lore.kernel.org/linux-riscv/20230531093817.665799-1-bjorn@kernel.org/ [1] Link: https://lore.kernel.org/linux-riscv/20230801090927.2018653-1-dylan@andestech.com [2] Link: https://lore.kernel.org/linux-riscv/20230725132246.817726-1-alexghiti@rivosinc.com/ [3] Link: https://lore.kernel.org/lkml/20200508144043.13893-1-joro@8bytes.org/ [4] Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com> Link: https://lore.kernel.org/r/20240717060125.139416-4-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-17 08:01:24 +02:00
OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu);
OFFSET(TASK_TI_FLAGS, task_struct, thread_info.flags);
OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count);
OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp);
OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp);
#ifdef CONFIG_SHADOW_CALL_STACK
OFFSET(TASK_TI_SCS_SP, task_struct, thread_info.scs_sp);
#endif
riscv: Stop emitting preventive sfence.vma for new vmalloc mappings In 6.5, we removed the vmalloc fault path because that can't work (see [1] [2]). Then in order to make sure that new page table entries were seen by the page table walker, we had to preventively emit a sfence.vma on all harts [3] but this solution is very costly since it relies on IPI. And even there, we could end up in a loop of vmalloc faults if a vmalloc allocation is done in the IPI path (for example if it is traced, see [4]), which could result in a kernel stack overflow. Those preventive sfence.vma needed to be emitted because: - if the uarch caches invalid entries, the new mapping may not be observed by the page table walker and an invalidation may be needed. - if the uarch does not cache invalid entries, a reordered access could "miss" the new mapping and traps: in that case, we would actually only need to retry the access, no sfence.vma is required. So this patch removes those preventive sfence.vma and actually handles the possible (and unlikely) exceptions. And since the kernel stacks mappings lie in the vmalloc area, this handling must be done very early when the trap is taken, at the very beginning of handle_exception: this also rules out the vmalloc allocations in the fault path. Link: https://lore.kernel.org/linux-riscv/20230531093817.665799-1-bjorn@kernel.org/ [1] Link: https://lore.kernel.org/linux-riscv/20230801090927.2018653-1-dylan@andestech.com [2] Link: https://lore.kernel.org/linux-riscv/20230725132246.817726-1-alexghiti@rivosinc.com/ [3] Link: https://lore.kernel.org/lkml/20200508144043.13893-1-joro@8bytes.org/ [4] Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com> Link: https://lore.kernel.org/r/20240717060125.139416-4-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-17 08:01:24 +02:00
#ifdef CONFIG_64BIT
OFFSET(TASK_TI_A0, task_struct, thread_info.a0);
OFFSET(TASK_TI_A1, task_struct, thread_info.a1);
OFFSET(TASK_TI_A2, task_struct, thread_info.a2);
#endif
riscv: VMAP_STACK overflow detection thread-safe commit 31da94c25aea ("riscv: add VMAP_STACK overflow detection") added support for CONFIG_VMAP_STACK. If overflow is detected, CPU switches to `shadow_stack` temporarily before switching finally to per-cpu `overflow_stack`. If two CPUs/harts are racing and end up in over flowing kernel stack, one or both will end up corrupting each other state because `shadow_stack` is not per-cpu. This patch optimizes per-cpu overflow stack switch by directly picking per-cpu `overflow_stack` and gets rid of `shadow_stack`. Following are the changes in this patch - Defines an asm macro to obtain per-cpu symbols in destination register. - In entry.S, when overflow is detected, per-cpu overflow stack is located using per-cpu asm macro. Computing per-cpu symbol requires a temporary register. x31 is saved away into CSR_SCRATCH (CSR_SCRATCH is anyways zero since we're in kernel). Please see Links for additional relevant disccussion and alternative solution. Tested by `echo EXHAUST_STACK > /sys/kernel/debug/provoke-crash/DIRECT` Kernel crash log below Insufficient stack space to handle exception!/debug/provoke-crash/DIRECT Task stack: [0xff20000010a98000..0xff20000010a9c000] Overflow stack: [0xff600001f7d98370..0xff600001f7d99370] CPU: 1 PID: 205 Comm: bash Not tainted 6.1.0-rc2-00001-g328a1f96f7b9 #34 Hardware name: riscv-virtio,qemu (DT) epc : __memset+0x60/0xfc ra : recursive_loop+0x48/0xc6 [lkdtm] epc : ffffffff808de0e4 ra : ffffffff0163a752 sp : ff20000010a97e80 gp : ffffffff815c0330 tp : ff600000820ea280 t0 : ff20000010a97e88 t1 : 000000000000002e t2 : 3233206874706564 s0 : ff20000010a982b0 s1 : 0000000000000012 a0 : ff20000010a97e88 a1 : 0000000000000000 a2 : 0000000000000400 a3 : ff20000010a98288 a4 : 0000000000000000 a5 : 0000000000000000 a6 : fffffffffffe43f0 a7 : 00007fffffffffff s2 : ff20000010a97e88 s3 : ffffffff01644680 s4 : ff20000010a9be90 s5 : ff600000842ba6c0 s6 : 00aaaaaac29e42b0 s7 : 00fffffff0aa3684 s8 : 00aaaaaac2978040 s9 : 0000000000000065 s10: 00ffffff8a7cad10 s11: 00ffffff8a76a4e0 t3 : ffffffff815dbaf4 t4 : ffffffff815dbaf4 t5 : ffffffff815dbab8 t6 : ff20000010a9bb48 status: 0000000200000120 badaddr: ff20000010a97e88 cause: 000000000000000f Kernel panic - not syncing: Kernel stack overflow CPU: 1 PID: 205 Comm: bash Not tainted 6.1.0-rc2-00001-g328a1f96f7b9 #34 Hardware name: riscv-virtio,qemu (DT) Call Trace: [<ffffffff80006754>] dump_backtrace+0x30/0x38 [<ffffffff808de798>] show_stack+0x40/0x4c [<ffffffff808ea2a8>] dump_stack_lvl+0x44/0x5c [<ffffffff808ea2d8>] dump_stack+0x18/0x20 [<ffffffff808dec06>] panic+0x126/0x2fe [<ffffffff800065ea>] walk_stackframe+0x0/0xf0 [<ffffffff0163a752>] recursive_loop+0x48/0xc6 [lkdtm] SMP: stopping secondary CPUs ---[ end Kernel panic - not syncing: Kernel stack overflow ]--- Cc: Guo Ren <guoren@kernel.org> Cc: Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/linux-riscv/Y347B0x4VUNOd6V7@xhacker/T/#t Link: https://lore.kernel.org/lkml/20221124094845.1907443-1-debug@rivosinc.com/ Signed-off-by: Deepak Gupta <debug@rivosinc.com> Co-developed-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Guo Ren <guoren@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20230927224757.1154247-9-samitolvanen@google.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-09-27 22:47:59 +00:00
OFFSET(TASK_TI_CPU_NUM, task_struct, thread_info.cpu);
OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]);
OFFSET(TASK_THREAD_F1, task_struct, thread.fstate.f[1]);
OFFSET(TASK_THREAD_F2, task_struct, thread.fstate.f[2]);
OFFSET(TASK_THREAD_F3, task_struct, thread.fstate.f[3]);
OFFSET(TASK_THREAD_F4, task_struct, thread.fstate.f[4]);
OFFSET(TASK_THREAD_F5, task_struct, thread.fstate.f[5]);
OFFSET(TASK_THREAD_F6, task_struct, thread.fstate.f[6]);
OFFSET(TASK_THREAD_F7, task_struct, thread.fstate.f[7]);
OFFSET(TASK_THREAD_F8, task_struct, thread.fstate.f[8]);
OFFSET(TASK_THREAD_F9, task_struct, thread.fstate.f[9]);
OFFSET(TASK_THREAD_F10, task_struct, thread.fstate.f[10]);
OFFSET(TASK_THREAD_F11, task_struct, thread.fstate.f[11]);
OFFSET(TASK_THREAD_F12, task_struct, thread.fstate.f[12]);
OFFSET(TASK_THREAD_F13, task_struct, thread.fstate.f[13]);
OFFSET(TASK_THREAD_F14, task_struct, thread.fstate.f[14]);
OFFSET(TASK_THREAD_F15, task_struct, thread.fstate.f[15]);
OFFSET(TASK_THREAD_F16, task_struct, thread.fstate.f[16]);
OFFSET(TASK_THREAD_F17, task_struct, thread.fstate.f[17]);
OFFSET(TASK_THREAD_F18, task_struct, thread.fstate.f[18]);
OFFSET(TASK_THREAD_F19, task_struct, thread.fstate.f[19]);
OFFSET(TASK_THREAD_F20, task_struct, thread.fstate.f[20]);
OFFSET(TASK_THREAD_F21, task_struct, thread.fstate.f[21]);
OFFSET(TASK_THREAD_F22, task_struct, thread.fstate.f[22]);
OFFSET(TASK_THREAD_F23, task_struct, thread.fstate.f[23]);
OFFSET(TASK_THREAD_F24, task_struct, thread.fstate.f[24]);
OFFSET(TASK_THREAD_F25, task_struct, thread.fstate.f[25]);
OFFSET(TASK_THREAD_F26, task_struct, thread.fstate.f[26]);
OFFSET(TASK_THREAD_F27, task_struct, thread.fstate.f[27]);
OFFSET(TASK_THREAD_F28, task_struct, thread.fstate.f[28]);
OFFSET(TASK_THREAD_F29, task_struct, thread.fstate.f[29]);
OFFSET(TASK_THREAD_F30, task_struct, thread.fstate.f[30]);
OFFSET(TASK_THREAD_F31, task_struct, thread.fstate.f[31]);
OFFSET(TASK_THREAD_FCSR, task_struct, thread.fstate.fcsr);
riscv: Enable per-task stack canaries This enables the use of per-task stack canary values if GCC has support for emitting the stack canary reference relative to the value of tp, which holds the task struct pointer in the riscv kernel. After compare arm64 and x86 implementations, seems arm64's is more flexible and readable. The key point is how gcc get the offset of stack_canary from gs/el0_sp. x86: Use a fix offset from gs, not flexible. struct fixed_percpu_data { /* * GCC hardcodes the stack canary as %gs:40. Since the * irq_stack is the object at %gs:0, we reserve the bottom * 48 bytes of the irq stack for the canary. */ char gs_base[40]; // :( unsigned long stack_canary; }; arm64: Use -mstack-protector-guard-offset & guard-reg gcc options: -mstack-protector-guard=sysreg -mstack-protector-guard-reg=sp_el0 -mstack-protector-guard-offset=xxx riscv: Use -mstack-protector-guard-offset & guard-reg gcc options: -mstack-protector-guard=tls -mstack-protector-guard-reg=tp -mstack-protector-guard-offset=xxx GCC's implementation has been merged: commit c931e8d5a96463427040b0d11f9c4352ac22b2b0 Author: Cooper Qu <cooper.qu@linux.alibaba.com> Date: Mon Jul 13 16:15:08 2020 +0800 RISC-V: Add support for TLS stack protector canary access In the end, these codes are inserted by gcc before return: * 0xffffffe00020b396 <+120>: ld a5,1008(tp) # 0x3f0 * 0xffffffe00020b39a <+124>: xor a5,a5,a4 * 0xffffffe00020b39c <+126>: mv a0,s5 * 0xffffffe00020b39e <+128>: bnez a5,0xffffffe00020b61c <_do_fork+766> 0xffffffe00020b3a2 <+132>: ld ra,136(sp) 0xffffffe00020b3a4 <+134>: ld s0,128(sp) 0xffffffe00020b3a6 <+136>: ld s1,120(sp) 0xffffffe00020b3a8 <+138>: ld s2,112(sp) 0xffffffe00020b3aa <+140>: ld s3,104(sp) 0xffffffe00020b3ac <+142>: ld s4,96(sp) 0xffffffe00020b3ae <+144>: ld s5,88(sp) 0xffffffe00020b3b0 <+146>: ld s6,80(sp) 0xffffffe00020b3b2 <+148>: ld s7,72(sp) 0xffffffe00020b3b4 <+150>: addi sp,sp,144 0xffffffe00020b3b6 <+152>: ret ... * 0xffffffe00020b61c <+766>: auipc ra,0x7f8 * 0xffffffe00020b620 <+770>: jalr -1764(ra) # 0xffffffe000a02f38 <__stack_chk_fail> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Cooper Qu <cooper.qu@linux.alibaba.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-12-17 16:29:18 +00:00
#ifdef CONFIG_STACKPROTECTOR
OFFSET(TSK_STACK_CANARY, task_struct, stack_canary);
#endif
DEFINE(PT_SIZE, sizeof(struct pt_regs));
OFFSET(PT_EPC, pt_regs, epc);
OFFSET(PT_RA, pt_regs, ra);
OFFSET(PT_FP, pt_regs, s0);
OFFSET(PT_S0, pt_regs, s0);
OFFSET(PT_S1, pt_regs, s1);
OFFSET(PT_S2, pt_regs, s2);
OFFSET(PT_S3, pt_regs, s3);
OFFSET(PT_S4, pt_regs, s4);
OFFSET(PT_S5, pt_regs, s5);
OFFSET(PT_S6, pt_regs, s6);
OFFSET(PT_S7, pt_regs, s7);
OFFSET(PT_S8, pt_regs, s8);
OFFSET(PT_S9, pt_regs, s9);
OFFSET(PT_S10, pt_regs, s10);
OFFSET(PT_S11, pt_regs, s11);
OFFSET(PT_SP, pt_regs, sp);
OFFSET(PT_TP, pt_regs, tp);
OFFSET(PT_A0, pt_regs, a0);
OFFSET(PT_A1, pt_regs, a1);
OFFSET(PT_A2, pt_regs, a2);
OFFSET(PT_A3, pt_regs, a3);
OFFSET(PT_A4, pt_regs, a4);
OFFSET(PT_A5, pt_regs, a5);
OFFSET(PT_A6, pt_regs, a6);
OFFSET(PT_A7, pt_regs, a7);
OFFSET(PT_T0, pt_regs, t0);
OFFSET(PT_T1, pt_regs, t1);
OFFSET(PT_T2, pt_regs, t2);
OFFSET(PT_T3, pt_regs, t3);
OFFSET(PT_T4, pt_regs, t4);
OFFSET(PT_T5, pt_regs, t5);
OFFSET(PT_T6, pt_regs, t6);
OFFSET(PT_GP, pt_regs, gp);
OFFSET(PT_ORIG_A0, pt_regs, orig_a0);
OFFSET(PT_STATUS, pt_regs, status);
OFFSET(PT_BADADDR, pt_regs, badaddr);
OFFSET(PT_CAUSE, pt_regs, cause);
OFFSET(SUSPEND_CONTEXT_REGS, suspend_context, regs);
RISC-V: Add arch functions to support hibernation/suspend-to-disk Low level Arch functions were created to support hibernation. swsusp_arch_suspend() relies code from __cpu_suspend_enter() to write cpu state onto the stack, then calling swsusp_save() to save the memory image. Arch specific hibernation header is implemented and is utilized by the arch_hibernation_header_restore() and arch_hibernation_header_save() functions. The arch specific hibernation header consists of satp, hartid, and the cpu_resume address. The kernel built version is also need to be saved into the hibernation image header to making sure only the same kernel is restore when resume. swsusp_arch_resume() creates a temporary page table that covering only the linear map. It copies the restore code to a 'safe' page, then start to restore the memory image. Once completed, it restores the original kernel's page table. It then calls into __hibernate_cpu_resume() to restore the CPU context. Finally, it follows the normal hibernation path back to the hibernation core. To enable hibernation/suspend to disk into RISCV, the below config need to be enabled: - CONFIG_HIBERNATION - CONFIG_ARCH_HIBERNATION_HEADER - CONFIG_ARCH_HIBERNATION_POSSIBLE Signed-off-by: Sia Jee Heng <jeeheng.sia@starfivetech.com> Reviewed-by: Ley Foon Tan <leyfoon.tan@starfivetech.com> Reviewed-by: Mason Huo <mason.huo@starfivetech.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230330064321.1008373-5-jeeheng.sia@starfivetech.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-03-30 14:43:21 +08:00
OFFSET(HIBERN_PBE_ADDR, pbe, address);
OFFSET(HIBERN_PBE_ORIG, pbe, orig_address);
OFFSET(HIBERN_PBE_NEXT, pbe, next);
OFFSET(KVM_ARCH_GUEST_ZERO, kvm_vcpu_arch, guest_context.zero);
OFFSET(KVM_ARCH_GUEST_RA, kvm_vcpu_arch, guest_context.ra);
OFFSET(KVM_ARCH_GUEST_SP, kvm_vcpu_arch, guest_context.sp);
OFFSET(KVM_ARCH_GUEST_GP, kvm_vcpu_arch, guest_context.gp);
OFFSET(KVM_ARCH_GUEST_TP, kvm_vcpu_arch, guest_context.tp);
OFFSET(KVM_ARCH_GUEST_T0, kvm_vcpu_arch, guest_context.t0);
OFFSET(KVM_ARCH_GUEST_T1, kvm_vcpu_arch, guest_context.t1);
OFFSET(KVM_ARCH_GUEST_T2, kvm_vcpu_arch, guest_context.t2);
OFFSET(KVM_ARCH_GUEST_S0, kvm_vcpu_arch, guest_context.s0);
OFFSET(KVM_ARCH_GUEST_S1, kvm_vcpu_arch, guest_context.s1);
OFFSET(KVM_ARCH_GUEST_A0, kvm_vcpu_arch, guest_context.a0);
OFFSET(KVM_ARCH_GUEST_A1, kvm_vcpu_arch, guest_context.a1);
OFFSET(KVM_ARCH_GUEST_A2, kvm_vcpu_arch, guest_context.a2);
OFFSET(KVM_ARCH_GUEST_A3, kvm_vcpu_arch, guest_context.a3);
OFFSET(KVM_ARCH_GUEST_A4, kvm_vcpu_arch, guest_context.a4);
OFFSET(KVM_ARCH_GUEST_A5, kvm_vcpu_arch, guest_context.a5);
OFFSET(KVM_ARCH_GUEST_A6, kvm_vcpu_arch, guest_context.a6);
OFFSET(KVM_ARCH_GUEST_A7, kvm_vcpu_arch, guest_context.a7);
OFFSET(KVM_ARCH_GUEST_S2, kvm_vcpu_arch, guest_context.s2);
OFFSET(KVM_ARCH_GUEST_S3, kvm_vcpu_arch, guest_context.s3);
OFFSET(KVM_ARCH_GUEST_S4, kvm_vcpu_arch, guest_context.s4);
OFFSET(KVM_ARCH_GUEST_S5, kvm_vcpu_arch, guest_context.s5);
OFFSET(KVM_ARCH_GUEST_S6, kvm_vcpu_arch, guest_context.s6);
OFFSET(KVM_ARCH_GUEST_S7, kvm_vcpu_arch, guest_context.s7);
OFFSET(KVM_ARCH_GUEST_S8, kvm_vcpu_arch, guest_context.s8);
OFFSET(KVM_ARCH_GUEST_S9, kvm_vcpu_arch, guest_context.s9);
OFFSET(KVM_ARCH_GUEST_S10, kvm_vcpu_arch, guest_context.s10);
OFFSET(KVM_ARCH_GUEST_S11, kvm_vcpu_arch, guest_context.s11);
OFFSET(KVM_ARCH_GUEST_T3, kvm_vcpu_arch, guest_context.t3);
OFFSET(KVM_ARCH_GUEST_T4, kvm_vcpu_arch, guest_context.t4);
OFFSET(KVM_ARCH_GUEST_T5, kvm_vcpu_arch, guest_context.t5);
OFFSET(KVM_ARCH_GUEST_T6, kvm_vcpu_arch, guest_context.t6);
OFFSET(KVM_ARCH_GUEST_SEPC, kvm_vcpu_arch, guest_context.sepc);
OFFSET(KVM_ARCH_GUEST_SSTATUS, kvm_vcpu_arch, guest_context.sstatus);
OFFSET(KVM_ARCH_GUEST_HSTATUS, kvm_vcpu_arch, guest_context.hstatus);
OFFSET(KVM_ARCH_GUEST_SCOUNTEREN, kvm_vcpu_arch, guest_csr.scounteren);
OFFSET(KVM_ARCH_HOST_ZERO, kvm_vcpu_arch, host_context.zero);
OFFSET(KVM_ARCH_HOST_RA, kvm_vcpu_arch, host_context.ra);
OFFSET(KVM_ARCH_HOST_SP, kvm_vcpu_arch, host_context.sp);
OFFSET(KVM_ARCH_HOST_GP, kvm_vcpu_arch, host_context.gp);
OFFSET(KVM_ARCH_HOST_TP, kvm_vcpu_arch, host_context.tp);
OFFSET(KVM_ARCH_HOST_T0, kvm_vcpu_arch, host_context.t0);
OFFSET(KVM_ARCH_HOST_T1, kvm_vcpu_arch, host_context.t1);
OFFSET(KVM_ARCH_HOST_T2, kvm_vcpu_arch, host_context.t2);
OFFSET(KVM_ARCH_HOST_S0, kvm_vcpu_arch, host_context.s0);
OFFSET(KVM_ARCH_HOST_S1, kvm_vcpu_arch, host_context.s1);
OFFSET(KVM_ARCH_HOST_A0, kvm_vcpu_arch, host_context.a0);
OFFSET(KVM_ARCH_HOST_A1, kvm_vcpu_arch, host_context.a1);
OFFSET(KVM_ARCH_HOST_A2, kvm_vcpu_arch, host_context.a2);
OFFSET(KVM_ARCH_HOST_A3, kvm_vcpu_arch, host_context.a3);
OFFSET(KVM_ARCH_HOST_A4, kvm_vcpu_arch, host_context.a4);
OFFSET(KVM_ARCH_HOST_A5, kvm_vcpu_arch, host_context.a5);
OFFSET(KVM_ARCH_HOST_A6, kvm_vcpu_arch, host_context.a6);
OFFSET(KVM_ARCH_HOST_A7, kvm_vcpu_arch, host_context.a7);
OFFSET(KVM_ARCH_HOST_S2, kvm_vcpu_arch, host_context.s2);
OFFSET(KVM_ARCH_HOST_S3, kvm_vcpu_arch, host_context.s3);
OFFSET(KVM_ARCH_HOST_S4, kvm_vcpu_arch, host_context.s4);
OFFSET(KVM_ARCH_HOST_S5, kvm_vcpu_arch, host_context.s5);
OFFSET(KVM_ARCH_HOST_S6, kvm_vcpu_arch, host_context.s6);
OFFSET(KVM_ARCH_HOST_S7, kvm_vcpu_arch, host_context.s7);
OFFSET(KVM_ARCH_HOST_S8, kvm_vcpu_arch, host_context.s8);
OFFSET(KVM_ARCH_HOST_S9, kvm_vcpu_arch, host_context.s9);
OFFSET(KVM_ARCH_HOST_S10, kvm_vcpu_arch, host_context.s10);
OFFSET(KVM_ARCH_HOST_S11, kvm_vcpu_arch, host_context.s11);
OFFSET(KVM_ARCH_HOST_T3, kvm_vcpu_arch, host_context.t3);
OFFSET(KVM_ARCH_HOST_T4, kvm_vcpu_arch, host_context.t4);
OFFSET(KVM_ARCH_HOST_T5, kvm_vcpu_arch, host_context.t5);
OFFSET(KVM_ARCH_HOST_T6, kvm_vcpu_arch, host_context.t6);
OFFSET(KVM_ARCH_HOST_SEPC, kvm_vcpu_arch, host_context.sepc);
OFFSET(KVM_ARCH_HOST_SSTATUS, kvm_vcpu_arch, host_context.sstatus);
OFFSET(KVM_ARCH_HOST_HSTATUS, kvm_vcpu_arch, host_context.hstatus);
OFFSET(KVM_ARCH_HOST_SSCRATCH, kvm_vcpu_arch, host_sscratch);
OFFSET(KVM_ARCH_HOST_STVEC, kvm_vcpu_arch, host_stvec);
OFFSET(KVM_ARCH_HOST_SCOUNTEREN, kvm_vcpu_arch, host_scounteren);
OFFSET(KVM_ARCH_TRAP_SEPC, kvm_cpu_trap, sepc);
OFFSET(KVM_ARCH_TRAP_SCAUSE, kvm_cpu_trap, scause);
OFFSET(KVM_ARCH_TRAP_STVAL, kvm_cpu_trap, stval);
OFFSET(KVM_ARCH_TRAP_HTVAL, kvm_cpu_trap, htval);
OFFSET(KVM_ARCH_TRAP_HTINST, kvm_cpu_trap, htinst);
/* F extension */
OFFSET(KVM_ARCH_FP_F_F0, kvm_cpu_context, fp.f.f[0]);
OFFSET(KVM_ARCH_FP_F_F1, kvm_cpu_context, fp.f.f[1]);
OFFSET(KVM_ARCH_FP_F_F2, kvm_cpu_context, fp.f.f[2]);
OFFSET(KVM_ARCH_FP_F_F3, kvm_cpu_context, fp.f.f[3]);
OFFSET(KVM_ARCH_FP_F_F4, kvm_cpu_context, fp.f.f[4]);
OFFSET(KVM_ARCH_FP_F_F5, kvm_cpu_context, fp.f.f[5]);
OFFSET(KVM_ARCH_FP_F_F6, kvm_cpu_context, fp.f.f[6]);
OFFSET(KVM_ARCH_FP_F_F7, kvm_cpu_context, fp.f.f[7]);
OFFSET(KVM_ARCH_FP_F_F8, kvm_cpu_context, fp.f.f[8]);
OFFSET(KVM_ARCH_FP_F_F9, kvm_cpu_context, fp.f.f[9]);
OFFSET(KVM_ARCH_FP_F_F10, kvm_cpu_context, fp.f.f[10]);
OFFSET(KVM_ARCH_FP_F_F11, kvm_cpu_context, fp.f.f[11]);
OFFSET(KVM_ARCH_FP_F_F12, kvm_cpu_context, fp.f.f[12]);
OFFSET(KVM_ARCH_FP_F_F13, kvm_cpu_context, fp.f.f[13]);
OFFSET(KVM_ARCH_FP_F_F14, kvm_cpu_context, fp.f.f[14]);
OFFSET(KVM_ARCH_FP_F_F15, kvm_cpu_context, fp.f.f[15]);
OFFSET(KVM_ARCH_FP_F_F16, kvm_cpu_context, fp.f.f[16]);
OFFSET(KVM_ARCH_FP_F_F17, kvm_cpu_context, fp.f.f[17]);
OFFSET(KVM_ARCH_FP_F_F18, kvm_cpu_context, fp.f.f[18]);
OFFSET(KVM_ARCH_FP_F_F19, kvm_cpu_context, fp.f.f[19]);
OFFSET(KVM_ARCH_FP_F_F20, kvm_cpu_context, fp.f.f[20]);
OFFSET(KVM_ARCH_FP_F_F21, kvm_cpu_context, fp.f.f[21]);
OFFSET(KVM_ARCH_FP_F_F22, kvm_cpu_context, fp.f.f[22]);
OFFSET(KVM_ARCH_FP_F_F23, kvm_cpu_context, fp.f.f[23]);
OFFSET(KVM_ARCH_FP_F_F24, kvm_cpu_context, fp.f.f[24]);
OFFSET(KVM_ARCH_FP_F_F25, kvm_cpu_context, fp.f.f[25]);
OFFSET(KVM_ARCH_FP_F_F26, kvm_cpu_context, fp.f.f[26]);
OFFSET(KVM_ARCH_FP_F_F27, kvm_cpu_context, fp.f.f[27]);
OFFSET(KVM_ARCH_FP_F_F28, kvm_cpu_context, fp.f.f[28]);
OFFSET(KVM_ARCH_FP_F_F29, kvm_cpu_context, fp.f.f[29]);
OFFSET(KVM_ARCH_FP_F_F30, kvm_cpu_context, fp.f.f[30]);
OFFSET(KVM_ARCH_FP_F_F31, kvm_cpu_context, fp.f.f[31]);
OFFSET(KVM_ARCH_FP_F_FCSR, kvm_cpu_context, fp.f.fcsr);
/* D extension */
OFFSET(KVM_ARCH_FP_D_F0, kvm_cpu_context, fp.d.f[0]);
OFFSET(KVM_ARCH_FP_D_F1, kvm_cpu_context, fp.d.f[1]);
OFFSET(KVM_ARCH_FP_D_F2, kvm_cpu_context, fp.d.f[2]);
OFFSET(KVM_ARCH_FP_D_F3, kvm_cpu_context, fp.d.f[3]);
OFFSET(KVM_ARCH_FP_D_F4, kvm_cpu_context, fp.d.f[4]);
OFFSET(KVM_ARCH_FP_D_F5, kvm_cpu_context, fp.d.f[5]);
OFFSET(KVM_ARCH_FP_D_F6, kvm_cpu_context, fp.d.f[6]);
OFFSET(KVM_ARCH_FP_D_F7, kvm_cpu_context, fp.d.f[7]);
OFFSET(KVM_ARCH_FP_D_F8, kvm_cpu_context, fp.d.f[8]);
OFFSET(KVM_ARCH_FP_D_F9, kvm_cpu_context, fp.d.f[9]);
OFFSET(KVM_ARCH_FP_D_F10, kvm_cpu_context, fp.d.f[10]);
OFFSET(KVM_ARCH_FP_D_F11, kvm_cpu_context, fp.d.f[11]);
OFFSET(KVM_ARCH_FP_D_F12, kvm_cpu_context, fp.d.f[12]);
OFFSET(KVM_ARCH_FP_D_F13, kvm_cpu_context, fp.d.f[13]);
OFFSET(KVM_ARCH_FP_D_F14, kvm_cpu_context, fp.d.f[14]);
OFFSET(KVM_ARCH_FP_D_F15, kvm_cpu_context, fp.d.f[15]);
OFFSET(KVM_ARCH_FP_D_F16, kvm_cpu_context, fp.d.f[16]);
OFFSET(KVM_ARCH_FP_D_F17, kvm_cpu_context, fp.d.f[17]);
OFFSET(KVM_ARCH_FP_D_F18, kvm_cpu_context, fp.d.f[18]);
OFFSET(KVM_ARCH_FP_D_F19, kvm_cpu_context, fp.d.f[19]);
OFFSET(KVM_ARCH_FP_D_F20, kvm_cpu_context, fp.d.f[20]);
OFFSET(KVM_ARCH_FP_D_F21, kvm_cpu_context, fp.d.f[21]);
OFFSET(KVM_ARCH_FP_D_F22, kvm_cpu_context, fp.d.f[22]);
OFFSET(KVM_ARCH_FP_D_F23, kvm_cpu_context, fp.d.f[23]);
OFFSET(KVM_ARCH_FP_D_F24, kvm_cpu_context, fp.d.f[24]);
OFFSET(KVM_ARCH_FP_D_F25, kvm_cpu_context, fp.d.f[25]);
OFFSET(KVM_ARCH_FP_D_F26, kvm_cpu_context, fp.d.f[26]);
OFFSET(KVM_ARCH_FP_D_F27, kvm_cpu_context, fp.d.f[27]);
OFFSET(KVM_ARCH_FP_D_F28, kvm_cpu_context, fp.d.f[28]);
OFFSET(KVM_ARCH_FP_D_F29, kvm_cpu_context, fp.d.f[29]);
OFFSET(KVM_ARCH_FP_D_F30, kvm_cpu_context, fp.d.f[30]);
OFFSET(KVM_ARCH_FP_D_F31, kvm_cpu_context, fp.d.f[31]);
OFFSET(KVM_ARCH_FP_D_FCSR, kvm_cpu_context, fp.d.fcsr);
/*
* THREAD_{F,X}* might be larger than a S-type offset can handle, but
* these are used in performance-sensitive assembly so we can't resort
* to loading the long immediate every time.
*/
DEFINE(TASK_THREAD_RA_RA,
offsetof(struct task_struct, thread.ra)
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_SP_RA,
offsetof(struct task_struct, thread.sp)
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_S0_RA,
offsetof(struct task_struct, thread.s[0])
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_S1_RA,
offsetof(struct task_struct, thread.s[1])
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_S2_RA,
offsetof(struct task_struct, thread.s[2])
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_S3_RA,
offsetof(struct task_struct, thread.s[3])
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_S4_RA,
offsetof(struct task_struct, thread.s[4])
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_S5_RA,
offsetof(struct task_struct, thread.s[5])
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_S6_RA,
offsetof(struct task_struct, thread.s[6])
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_S7_RA,
offsetof(struct task_struct, thread.s[7])
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_S8_RA,
offsetof(struct task_struct, thread.s[8])
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_S9_RA,
offsetof(struct task_struct, thread.s[9])
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_S10_RA,
offsetof(struct task_struct, thread.s[10])
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_S11_RA,
offsetof(struct task_struct, thread.s[11])
- offsetof(struct task_struct, thread.ra)
);
DEFINE(TASK_THREAD_F0_F0,
offsetof(struct task_struct, thread.fstate.f[0])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F1_F0,
offsetof(struct task_struct, thread.fstate.f[1])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F2_F0,
offsetof(struct task_struct, thread.fstate.f[2])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F3_F0,
offsetof(struct task_struct, thread.fstate.f[3])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F4_F0,
offsetof(struct task_struct, thread.fstate.f[4])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F5_F0,
offsetof(struct task_struct, thread.fstate.f[5])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F6_F0,
offsetof(struct task_struct, thread.fstate.f[6])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F7_F0,
offsetof(struct task_struct, thread.fstate.f[7])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F8_F0,
offsetof(struct task_struct, thread.fstate.f[8])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F9_F0,
offsetof(struct task_struct, thread.fstate.f[9])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F10_F0,
offsetof(struct task_struct, thread.fstate.f[10])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F11_F0,
offsetof(struct task_struct, thread.fstate.f[11])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F12_F0,
offsetof(struct task_struct, thread.fstate.f[12])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F13_F0,
offsetof(struct task_struct, thread.fstate.f[13])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F14_F0,
offsetof(struct task_struct, thread.fstate.f[14])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F15_F0,
offsetof(struct task_struct, thread.fstate.f[15])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F16_F0,
offsetof(struct task_struct, thread.fstate.f[16])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F17_F0,
offsetof(struct task_struct, thread.fstate.f[17])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F18_F0,
offsetof(struct task_struct, thread.fstate.f[18])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F19_F0,
offsetof(struct task_struct, thread.fstate.f[19])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F20_F0,
offsetof(struct task_struct, thread.fstate.f[20])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F21_F0,
offsetof(struct task_struct, thread.fstate.f[21])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F22_F0,
offsetof(struct task_struct, thread.fstate.f[22])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F23_F0,
offsetof(struct task_struct, thread.fstate.f[23])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F24_F0,
offsetof(struct task_struct, thread.fstate.f[24])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F25_F0,
offsetof(struct task_struct, thread.fstate.f[25])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F26_F0,
offsetof(struct task_struct, thread.fstate.f[26])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F27_F0,
offsetof(struct task_struct, thread.fstate.f[27])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F28_F0,
offsetof(struct task_struct, thread.fstate.f[28])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F29_F0,
offsetof(struct task_struct, thread.fstate.f[29])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F30_F0,
offsetof(struct task_struct, thread.fstate.f[30])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_F31_F0,
offsetof(struct task_struct, thread.fstate.f[31])
- offsetof(struct task_struct, thread.fstate.f[0])
);
DEFINE(TASK_THREAD_FCSR_F0,
offsetof(struct task_struct, thread.fstate.fcsr)
- offsetof(struct task_struct, thread.fstate.f[0])
);
/*
* We allocate a pt_regs on the stack when entering the kernel. This
* ensures the alignment is sane.
*/
DEFINE(PT_SIZE_ON_STACK, ALIGN(sizeof(struct pt_regs), STACK_ALIGN));
OFFSET(KERNEL_MAP_VIRT_ADDR, kernel_mapping, virt_addr);
OFFSET(SBI_HART_BOOT_TASK_PTR_OFFSET, sbi_hart_boot_data, task_ptr);
OFFSET(SBI_HART_BOOT_STACK_PTR_OFFSET, sbi_hart_boot_data, stack_ptr);
DEFINE(STACKFRAME_SIZE_ON_STACK, ALIGN(sizeof(struct stackframe), STACK_ALIGN));
OFFSET(STACKFRAME_FP, stackframe, fp);
OFFSET(STACKFRAME_RA, stackframe, ra);
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
ftrace: Make ftrace_regs abstract from direct use ftrace_regs was created to hold registers that store information to save function parameters, return value and stack. Since it is a subset of pt_regs, it should only be used by its accessor functions. But because pt_regs can easily be taken from ftrace_regs (on most archs), it is tempting to use it directly. But when running on other architectures, it may fail to build or worse, build but crash the kernel! Instead, make struct ftrace_regs an empty structure and have the architectures define __arch_ftrace_regs and all the accessor functions will typecast to it to get to the actual fields. This will help avoid usage of ftrace_regs directly. Link: https://lore.kernel.org/all/20241007171027.629bdafd@gandalf.local.home/ Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org> Cc: "x86@kernel.org" <x86@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Naveen N Rao <naveen@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/20241008230628.958778821@goodmis.org Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390 Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-08 19:05:28 -04:00
DEFINE(FREGS_SIZE_ON_STACK, ALIGN(sizeof(struct __arch_ftrace_regs), STACK_ALIGN));
DEFINE(FREGS_EPC, offsetof(struct __arch_ftrace_regs, epc));
DEFINE(FREGS_RA, offsetof(struct __arch_ftrace_regs, ra));
DEFINE(FREGS_SP, offsetof(struct __arch_ftrace_regs, sp));
DEFINE(FREGS_S0, offsetof(struct __arch_ftrace_regs, s0));
DEFINE(FREGS_T1, offsetof(struct __arch_ftrace_regs, t1));
DEFINE(FREGS_A0, offsetof(struct __arch_ftrace_regs, a0));
DEFINE(FREGS_A1, offsetof(struct __arch_ftrace_regs, a1));
DEFINE(FREGS_A2, offsetof(struct __arch_ftrace_regs, a2));
DEFINE(FREGS_A3, offsetof(struct __arch_ftrace_regs, a3));
DEFINE(FREGS_A4, offsetof(struct __arch_ftrace_regs, a4));
DEFINE(FREGS_A5, offsetof(struct __arch_ftrace_regs, a5));
DEFINE(FREGS_A6, offsetof(struct __arch_ftrace_regs, a6));
DEFINE(FREGS_A7, offsetof(struct __arch_ftrace_regs, a7));
ftrace: riscv: move from REGS to ARGS This commit replaces riscv's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop relying on stop_machine() for RISCV's implementation of ftrace. The main relevant benefit that this change will bring for the above use-case is that now we don't have separate ftrace_caller and ftrace_regs_caller trampolines. This will allow the callsite to call ftrace_caller by modifying a single instruction. Now the callsite can do something similar to: When not tracing: | When tracing: func: func: auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top nop <=========<Enable/Disable>=========> jalr t0, ftrace_caller_bottom [...] [...] The above assumes that we are dropping the support of calling a direct trampoline from the callsite. We need to drop this as the callsite can't change the target address to call, it can only enable/disable a call to a preset target (ftrace_caller in the above diagram). We can later optimize this by calling an intermediate dispatcher trampoline before ftrace_caller. Currently, ftrace_regs_caller saves all CPU registers in the format of struct pt_regs and allows the tracer to modify them. We don't need to save all of the CPU registers because at function entry only a subset of pt_regs is live: |----------+----------+---------------------------------------------| | Register | ABI Name | Description | |----------+----------+---------------------------------------------| | x1 | ra | Return address for traced function | | x2 | sp | Stack pointer | | x5 | t0 | Return address for ftrace_caller trampoline | | x8 | s0/fp | Frame pointer | | x10-11 | a0-1 | Function arguments/return values | | x12-17 | a2-7 | Function arguments | |----------+----------+---------------------------------------------| See RISCV calling convention[1] for the above table. Saving just the live registers decreases the amount of stack space required from 288 Bytes to 112 Bytes. Basic testing was done with this on the VisionFive 2 development board. Note: - Moving from REGS to ARGS will mean that RISCV will stop supporting KPROBES_ON_FTRACE as it requires full pt_regs to be saved. - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2]. [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf [2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-05 14:24:53 +00:00
#endif
}