2019-06-03 07:44:50 +02:00
|
|
|
// SPDX-License-Identifier: GPL-2.0-only
|
2012-03-05 11:49:31 +00:00
|
|
|
/*
|
|
|
|
* Based on arch/arm/kernel/signal.c
|
|
|
|
*
|
|
|
|
* Copyright (C) 1995-2009 Russell King
|
|
|
|
* Copyright (C) 2012 ARM Ltd.
|
|
|
|
*/
|
|
|
|
|
arm64: signal: Report signal frame size to userspace via auxv
Stateful CPU architecture extensions may require the signal frame
to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
However, changing this #define is an ABI break.
To allow userspace the option of determining the signal frame size
in a more forwards-compatible way, this patch adds a new auxv entry
tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
size that the process can observe during its lifetime.
If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
assume that the MINSIGSTKSZ #define is sufficient. This allows for
a consistent interface with older kernels that do not provide
AT_MINSIGSTKSZ.
The idea is that libc could expose this via sysconf() or some
similar mechanism.
There is deliberately no AT_SIGSTKSZ. The kernel knows nothing
about userspace's own stack overheads and should not pretend to
know.
For arm64:
The primary motivation for this interface is the Scalable Vector
Extension, which can require at least 4KB or so of extra space
in the signal frame for the largest hardware implementations.
To determine the correct value, a "Christmas tree" mode (via the
add_all argument) is added to setup_sigframe_layout(), to simulate
addition of all possible records to the signal frame at maximum
possible size.
If this procedure goes wrong somehow, resulting in a stupidly large
frame layout and hence failure of sigframe_alloc() to allocate a
record to the frame, then this is indicative of a kernel bug. In
this case, we WARN() and no attempt is made to populate
AT_MINSIGSTKSZ for userspace.
For arm64 SVE:
The SVE context block in the signal frame needs to be considered
too when computing the maximum possible signal frame size.
Because the size of this block depends on the vector length, this
patch computes the size based not on the thread's current vector
length but instead on the maximum possible vector length: this
determines the maximum size of SVE context block that can be
observed in any signal frame for the lifetime of the process.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-01 11:10:14 +01:00
|
|
|
#include <linux/cache.h>
|
2014-04-30 10:51:32 +01:00
|
|
|
#include <linux/compat.h>
|
2012-03-05 11:49:31 +00:00
|
|
|
#include <linux/errno.h>
|
2017-06-15 15:03:38 +01:00
|
|
|
#include <linux/kernel.h>
|
2012-03-05 11:49:31 +00:00
|
|
|
#include <linux/signal.h>
|
|
|
|
#include <linux/freezer.h>
|
2017-06-15 15:03:39 +01:00
|
|
|
#include <linux/stddef.h>
|
2012-03-05 11:49:31 +00:00
|
|
|
#include <linux/uaccess.h>
|
2017-06-20 18:23:39 +01:00
|
|
|
#include <linux/sizes.h>
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
#include <linux/string.h>
|
2012-03-05 11:49:31 +00:00
|
|
|
#include <linux/ratelimit.h>
|
2024-02-06 12:38:47 +00:00
|
|
|
#include <linux/rseq.h>
|
2017-06-14 18:12:03 -07:00
|
|
|
#include <linux/syscalls.h>
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
#include <linux/pkeys.h>
|
2012-03-05 11:49:31 +00:00
|
|
|
|
2017-11-02 12:12:37 +00:00
|
|
|
#include <asm/daifflags.h>
|
2012-03-05 11:49:31 +00:00
|
|
|
#include <asm/debug-monitors.h>
|
|
|
|
#include <asm/elf.h>
|
2023-05-16 18:06:40 +02:00
|
|
|
#include <asm/exception.h>
|
2012-03-05 11:49:31 +00:00
|
|
|
#include <asm/cacheflush.h>
|
2024-10-01 23:59:04 +01:00
|
|
|
#include <asm/gcs.h>
|
2012-03-05 11:49:31 +00:00
|
|
|
#include <asm/ucontext.h>
|
|
|
|
#include <asm/unistd.h>
|
|
|
|
#include <asm/fpsimd.h>
|
2017-08-01 15:35:54 +01:00
|
|
|
#include <asm/ptrace.h>
|
arm64: fix compat syscall return truncation
Due to inconsistencies in the way we manipulate compat GPRs, we have a
few issues today:
* For audit and tracing, where error codes are handled as a (native)
long, negative error codes are expected to be sign-extended to the
native 64-bits, or they may fail to be matched correctly. Thus a
syscall which fails with an error may erroneously be identified as
failing.
* For ptrace, *all* compat return values should be sign-extended for
consistency with 32-bit arm, but we currently only do this for
negative return codes.
* As we may transiently set the upper 32 bits of some compat GPRs while
in the kernel, these can be sampled by perf, which is somewhat
confusing. This means that where a syscall returns a pointer above 2G,
this will be sign-extended, but will not be mistaken for an error as
error codes are constrained to the inclusive range [-4096, -1] where
no user pointer can exist.
To fix all of these, we must consistently use helpers to get/set the
compat GPRs, ensuring that we never write the upper 32 bits of the
return code, and always sign-extend when reading the return code. This
patch does so, with the following changes:
* We re-organise syscall_get_return_value() to always sign-extend for
compat tasks, and reimplement syscall_get_error() atop. We update
syscall_trace_exit() to use syscall_get_return_value().
* We consistently use syscall_set_return_value() to set the return
value, ensureing the upper 32 bits are never set unexpectedly.
* As the core audit code currently uses regs_return_value() rather than
syscall_get_return_value(), we special-case this for
compat_user_mode(regs) such that this will do the right thing. Going
forward, we should try to move the core audit code over to
syscall_get_return_value().
Cc: <stable@vger.kernel.org>
Reported-by: He Zhe <zhe.he@windriver.com>
Reported-by: weiyuchen <weiyuchen3@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210802104200.21390-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-08-02 11:42:00 +01:00
|
|
|
#include <asm/syscall.h>
|
2012-03-05 11:49:31 +00:00
|
|
|
#include <asm/signal32.h>
|
2018-02-20 15:05:17 +00:00
|
|
|
#include <asm/traps.h>
|
2012-03-05 11:49:31 +00:00
|
|
|
#include <asm/vdso.h>
|
|
|
|
|
2024-10-01 23:59:04 +01:00
|
|
|
#define GCS_SIGNAL_CAP(addr) (((unsigned long)addr) & GCS_CAP_ADDR_MASK)
|
|
|
|
|
2012-03-05 11:49:31 +00:00
|
|
|
/*
|
|
|
|
* Do a signal return; undo the signal stack. These are aligned to 128-bit.
|
|
|
|
*/
|
|
|
|
struct rt_sigframe {
|
|
|
|
struct siginfo info;
|
|
|
|
struct ucontext uc;
|
2017-06-15 15:03:38 +01:00
|
|
|
};
|
|
|
|
|
|
|
|
struct rt_sigframe_user_layout {
|
|
|
|
struct rt_sigframe __user *sigframe;
|
|
|
|
struct frame_record __user *next_frame;
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
|
|
|
|
unsigned long size; /* size of allocated sigframe data */
|
|
|
|
unsigned long limit; /* largest allowed size */
|
|
|
|
|
|
|
|
unsigned long fpsimd_offset;
|
|
|
|
unsigned long esr_offset;
|
2024-10-01 23:59:05 +01:00
|
|
|
unsigned long gcs_offset;
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
unsigned long sve_offset;
|
2022-12-27 14:20:41 +00:00
|
|
|
unsigned long tpidr2_offset;
|
2022-04-19 12:22:27 +01:00
|
|
|
unsigned long za_offset;
|
2023-01-16 16:04:46 +00:00
|
|
|
unsigned long zt_offset;
|
2024-03-06 23:14:49 +00:00
|
|
|
unsigned long fpmr_offset;
|
2024-08-22 16:11:02 +01:00
|
|
|
unsigned long poe_offset;
|
2017-06-20 18:23:39 +01:00
|
|
|
unsigned long extra_offset;
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
unsigned long end_offset;
|
2017-06-15 15:03:38 +01:00
|
|
|
};
|
|
|
|
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
/*
|
|
|
|
* Holds any EL0-controlled state that influences unprivileged memory accesses.
|
|
|
|
* This includes both accesses done in userspace and uaccess done in the kernel.
|
|
|
|
*
|
|
|
|
* This state needs to be carefully managed to ensure that it doesn't cause
|
|
|
|
* uaccess to fail when setting up the signal frame, and the signal handler
|
|
|
|
* itself also expects a well-defined state when entered.
|
|
|
|
*/
|
|
|
|
struct user_access_state {
|
|
|
|
u64 por_el0;
|
|
|
|
};
|
|
|
|
|
2017-06-20 18:23:39 +01:00
|
|
|
#define TERMINATOR_SIZE round_up(sizeof(struct _aarch64_ctx), 16)
|
|
|
|
#define EXTRA_CONTEXT_SIZE round_up(sizeof(struct extra_context), 16)
|
|
|
|
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
/*
|
|
|
|
* Save the user access state into ua_state and reset it to disable any
|
|
|
|
* restrictions.
|
|
|
|
*/
|
|
|
|
static void save_reset_user_access_state(struct user_access_state *ua_state)
|
|
|
|
{
|
|
|
|
if (system_supports_poe()) {
|
|
|
|
u64 por_enable_all = 0;
|
|
|
|
|
|
|
|
for (int pkey = 0; pkey < arch_max_pkey(); pkey++)
|
2025-02-19 16:40:28 +00:00
|
|
|
por_enable_all |= POR_ELx_PERM_PREP(pkey, POE_RWX);
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
|
|
|
|
ua_state->por_el0 = read_sysreg_s(SYS_POR_EL0);
|
|
|
|
write_sysreg_s(por_enable_all, SYS_POR_EL0);
|
2025-06-19 17:00:42 +01:00
|
|
|
/*
|
|
|
|
* No ISB required as we can tolerate spurious Overlay faults -
|
|
|
|
* the fault handler will check again based on the new value
|
|
|
|
* of POR_EL0.
|
|
|
|
*/
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Set the user access state for invoking the signal handler.
|
|
|
|
*
|
|
|
|
* No uaccess should be done after that function is called.
|
|
|
|
*/
|
|
|
|
static void set_handler_user_access_state(void)
|
|
|
|
{
|
|
|
|
if (system_supports_poe())
|
|
|
|
write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Restore the user access state to the values saved in ua_state.
|
|
|
|
*
|
|
|
|
* No uaccess should be done after that function is called.
|
|
|
|
*/
|
|
|
|
static void restore_user_access_state(const struct user_access_state *ua_state)
|
|
|
|
{
|
|
|
|
if (system_supports_poe())
|
|
|
|
write_sysreg_s(ua_state->por_el0, SYS_POR_EL0);
|
|
|
|
}
|
|
|
|
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
static void init_user_layout(struct rt_sigframe_user_layout *user)
|
|
|
|
{
|
2017-06-20 18:23:39 +01:00
|
|
|
const size_t reserved_size =
|
|
|
|
sizeof(user->sigframe->uc.uc_mcontext.__reserved);
|
|
|
|
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
memset(user, 0, sizeof(*user));
|
|
|
|
user->size = offsetof(struct rt_sigframe, uc.uc_mcontext.__reserved);
|
|
|
|
|
2017-06-20 18:23:39 +01:00
|
|
|
user->limit = user->size + reserved_size;
|
|
|
|
|
|
|
|
user->limit -= TERMINATOR_SIZE;
|
|
|
|
user->limit -= EXTRA_CONTEXT_SIZE;
|
|
|
|
/* Reserve space for extension and terminator ^ */
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
static size_t sigframe_size(struct rt_sigframe_user_layout const *user)
|
|
|
|
{
|
|
|
|
return round_up(max(user->size, sizeof(struct rt_sigframe)), 16);
|
|
|
|
}
|
|
|
|
|
2017-06-20 18:23:39 +01:00
|
|
|
/*
|
|
|
|
* Sanity limit on the approximate maximum size of signal frame we'll
|
|
|
|
* try to generate. Stack alignment padding and the frame record are
|
|
|
|
* not taken into account. This limit is not a guarantee and is
|
|
|
|
* NOT ABI.
|
|
|
|
*/
|
2022-08-17 19:23:21 +01:00
|
|
|
#define SIGFRAME_MAXSZ SZ_256K
|
2017-06-20 18:23:39 +01:00
|
|
|
|
|
|
|
static int __sigframe_alloc(struct rt_sigframe_user_layout *user,
|
|
|
|
unsigned long *offset, size_t size, bool extend)
|
|
|
|
{
|
|
|
|
size_t padded_size = round_up(size, 16);
|
|
|
|
|
|
|
|
if (padded_size > user->limit - user->size &&
|
|
|
|
!user->extra_offset &&
|
|
|
|
extend) {
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
user->limit += EXTRA_CONTEXT_SIZE;
|
|
|
|
ret = __sigframe_alloc(user, &user->extra_offset,
|
|
|
|
sizeof(struct extra_context), false);
|
|
|
|
if (ret) {
|
|
|
|
user->limit -= EXTRA_CONTEXT_SIZE;
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Reserve space for the __reserved[] terminator */
|
|
|
|
user->size += TERMINATOR_SIZE;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Allow expansion up to SIGFRAME_MAXSZ, ensuring space for
|
|
|
|
* the terminator:
|
|
|
|
*/
|
|
|
|
user->limit = SIGFRAME_MAXSZ - TERMINATOR_SIZE;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Still not enough space? Bad luck! */
|
|
|
|
if (padded_size > user->limit - user->size)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
|
|
|
*offset = user->size;
|
|
|
|
user->size += padded_size;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2017-06-15 15:03:41 +01:00
|
|
|
/*
|
|
|
|
* Allocate space for an optional record of <size> bytes in the user
|
|
|
|
* signal frame. The offset from the signal frame base address to the
|
|
|
|
* allocated block is assigned to *offset.
|
|
|
|
*/
|
|
|
|
static int sigframe_alloc(struct rt_sigframe_user_layout *user,
|
|
|
|
unsigned long *offset, size_t size)
|
|
|
|
{
|
2017-06-20 18:23:39 +01:00
|
|
|
return __sigframe_alloc(user, offset, size, true);
|
|
|
|
}
|
2017-06-15 15:03:41 +01:00
|
|
|
|
2017-06-20 18:23:39 +01:00
|
|
|
/* Allocate the null terminator record and prevent further allocations */
|
|
|
|
static int sigframe_alloc_end(struct rt_sigframe_user_layout *user)
|
|
|
|
{
|
|
|
|
int ret;
|
2017-06-15 15:03:41 +01:00
|
|
|
|
2017-06-20 18:23:39 +01:00
|
|
|
/* Un-reserve the space reserved for the terminator: */
|
|
|
|
user->limit += TERMINATOR_SIZE;
|
|
|
|
|
|
|
|
ret = sigframe_alloc(user, &user->end_offset,
|
|
|
|
sizeof(struct _aarch64_ctx));
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
/* Prevent further allocation: */
|
|
|
|
user->limit = user->size;
|
2017-06-15 15:03:41 +01:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
static void __user *apply_user_offset(
|
|
|
|
struct rt_sigframe_user_layout const *user, unsigned long offset)
|
|
|
|
{
|
|
|
|
char __user *base = (char __user *)user->sigframe;
|
|
|
|
|
|
|
|
return base + offset;
|
|
|
|
}
|
|
|
|
|
2023-01-31 22:20:41 +00:00
|
|
|
struct user_ctxs {
|
|
|
|
struct fpsimd_context __user *fpsimd;
|
2023-01-31 22:20:42 +00:00
|
|
|
u32 fpsimd_size;
|
2023-01-31 22:20:41 +00:00
|
|
|
struct sve_context __user *sve;
|
2023-01-31 22:20:42 +00:00
|
|
|
u32 sve_size;
|
2023-01-31 22:20:41 +00:00
|
|
|
struct tpidr2_context __user *tpidr2;
|
2023-01-31 22:20:42 +00:00
|
|
|
u32 tpidr2_size;
|
2023-01-31 22:20:41 +00:00
|
|
|
struct za_context __user *za;
|
2023-01-31 22:20:42 +00:00
|
|
|
u32 za_size;
|
2023-01-31 22:20:41 +00:00
|
|
|
struct zt_context __user *zt;
|
2023-01-31 22:20:42 +00:00
|
|
|
u32 zt_size;
|
2024-03-06 23:14:49 +00:00
|
|
|
struct fpmr_context __user *fpmr;
|
|
|
|
u32 fpmr_size;
|
2024-08-22 16:11:02 +01:00
|
|
|
struct poe_context __user *poe;
|
|
|
|
u32 poe_size;
|
2024-10-01 23:59:05 +01:00
|
|
|
struct gcs_context __user *gcs;
|
|
|
|
u32 gcs_size;
|
2023-01-31 22:20:41 +00:00
|
|
|
};
|
|
|
|
|
2012-03-05 11:49:31 +00:00
|
|
|
static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
|
|
|
|
{
|
2018-03-28 10:50:49 +01:00
|
|
|
struct user_fpsimd_state const *fpsimd =
|
|
|
|
¤t->thread.uw.fpsimd_state;
|
2012-03-05 11:49:31 +00:00
|
|
|
int err;
|
|
|
|
|
arm64/fpsimd: Clarify sve_sync_*() functions
The sve_sync_{to,from}_fpsimd*() functions are intended to
extract/insert the currently effective FPSIMD state of a task regardless
of whether the task's state is saved in FPSIMD format or SVE format.
Historically they were only used by ptrace, but sve_sync_to_fpsimd() is
now used more widely, and sve_sync_from_fpsimd_zeropad() may be used
more widely in future.
When FPSIMD/SVE state tracking was changed across commits:
baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
a0136be443d5 (arm64/fpsimd: Load FP state based on recorded data type")
bbc6172eefdb ("arm64/fpsimd: SME no longer requires SVE register state")
8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
... sve_sync_to_fpsimd() was updated to consider task->thread.fp_type
rather than the task's TIF_SVE and PSTATE.SM, but (apparently due to an
oversight) sve_sync_from_fpsimd_zeropad() was left as-is, leaving the
two inconsistent.
Due to this, sve_sync_from_fpsimd_zeropad() may copy state from
task->thread.uw.fpsimd_state into task->thread.sve_state when
task->thread.fp_type == FP_STATE_FPSIMD. This is redundant (but benign)
as task->thread.uw.fpsimd_state is the effective state that will be
restored, and task->thread.sve_state will not be consumed. For
consistency, and to avoid the redundant work, it better for
sve_sync_from_fpsimd_zeropad() to consider task->thread.fp_type alone,
matching sve_sync_to_fpsimd().
The naming of both functions is somehat unfortunate, as it is unclear
when and why they copy state. It would be better to describe them in
terms of the effective state.
Considering all of the above, clean this up:
* Adjust sve_sync_from_fpsimd_zeropad() to consider
task->thread.fp_type.
* Update comments to clarify the intended semantics/usage. I've removed
the description that task->thread.sve_state must have been allocated,
as this is only necessary when task->thread.fp_type == FP_STATE_SVE,
which itself implies that task->thread.sve_state must have been
allocated.
* Rename the functions to more clearly indicate when/why they copy
state:
- sve_sync_to_fpsimd() => fpsimd_sync_from_effective_state()
- sve_sync_from_fpsimd_zeropad => fpsimd_sync_to_effective_state_zeropad()
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-7-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08 14:26:26 +01:00
|
|
|
fpsimd_sync_from_effective_state(current);
|
arm64/fpsimd: signal: Always save+flush state early
There are several issues with the way the native signal handling code
manipulates FPSIMD/SVE/SME state, described in detail below. These
issues largely result from races with preemption and inconsistent
handling of live state vs saved state.
Known issues with native FPSIMD/SVE/SME state management include:
* On systems with FPMR, the code to save/restore the FPMR accesses the
register while it is not owned by the current task. Consequently, this
may corrupt the FPMR of the current task and/or may corrupt the FPMR
of an unrelated task. The FPMR save/restore has been broken since it
was introduced in commit:
8c46def44409fc91 ("arm64/signal: Add FPMR signal handling")
* On systems with SME, setup_return() modifies both the live register
state and the saved state register state regardless of whether the
task's state is live, and without holding the cpu fpsimd context.
Consequently:
- This may corrupt the state an unrelated task which has PSTATE.SM set
and/or PSTATE.ZA set.
- The task may enter the signal handler in streaming mode, and or with
ZA storage enabled unexpectedly.
- The task may enter the signal handler in non-streaming SVE mode with
stale SVE register state, which may have been inherited from
streaming SVE mode unexpectedly. Where the streaming and
non-streaming vector lengths differ, this may be packed into
registers arbitrarily.
This logic has been broken since it was introduced in commit:
40a8e87bb32855b3 ("arm64/sme: Disable ZA and streaming mode when handling signals")
Further incorrect manipulation of state was added in commits:
ea64baacbc36a0d5 ("arm64/signal: Flush FPSIMD register state when disabling streaming mode")
baa8515281b30861 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
* Several restoration functions use fpsimd_flush_task_state() to discard
the live FPSIMD/SVE/SME while the in-memory copy is stale.
When a subset of the FPSIMD/SVE/SME state is restored, the remainder
may be non-deterministically reset to a stale snapshot from some
arbitrary point in the past.
This non-deterministic discarding was introduced in commit:
8cd969d28fd2848d ("arm64/sve: Signal handling support")
As of that commit, when TIF_SVE was initially clear, failure to
restore the SVE signal frame could reset the FPSIMD registers to a
stale snapshot.
The pattern of discarding unsaved state was subsequently copied into
restoration functions for some new state in commits:
39782210eb7e8763 ("arm64/sme: Implement ZA signal handling")
ee072cf708048c0d ("arm64/sme: Implement signal handling for ZT")
* On systems with SME/SME2, the entire FPSIMD/SVE/SME state may be
loaded onto the CPU redundantly. Either restore_fpsimd_context() or
restore_sve_fpsimd_context() will load the entire FPSIMD/SVE/SME state
via fpsimd_update_current_state() before restore_za_context() and
restore_zt_context() each discard the state via
fpsimd_flush_task_state().
This is purely redundant work, and not a functional bug.
To fix these issues, rework the native signal handling code to always
save+flush the current task's FPSIMD/SVE/SME state before manipulating
that state. This avoids races with preemption and ensures that state is
manipulated consistently regardless of whether it happened to be live
prior to manipulation. This largely involes:
* Using fpsimd_save_and_flush_current_state() to save+flush the state
for both signal delivery and signal return, before the state is
manipulated in any way.
* Removing fpsimd_signal_preserve_current_state() and updating
preserve_fpsimd_context() to explicitly ensure that the FPSIMD state
is up-to-date, as preserve_fpsimd_context() is the only consumer of
the FPSIMD state during signal delivery.
* Modifying fpsimd_update_current_state() to not reload the FPSIMD state
onto the CPU. Ideally we'd remove fpsimd_update_current_state()
entirely, but I've left that for subsequent patches as there are a
number of of other problems with the FPSIMD<->SVE conversion helpers
that should be addressed at the same time. For now I've removed the
misleading comment.
For setup_return(), we need to decide (for ABI reasons) whether signal
delivery should have all the side-effects of an SMSTOP. For now I've
left a TODO comment, as there are other questions in this area that I'll
address with subsequent patches.
Fixes: 8c46def44409 ("arm64/signal: Add FPMR signal handling")
Fixes: 40a8e87bb328 ("arm64/sme: Disable ZA and streaming mode when handling signals")
Fixes: ea64baacbc36 ("arm64/signal: Flush FPSIMD register state when disabling streaming mode")
Fixes: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support")
Fixes: 39782210eb7e ("arm64/sme: Implement ZA signal handling")
Fixes: ee072cf70804 ("arm64/sme: Implement signal handling for ZT")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250409164010.3480271-13-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09 17:40:09 +01:00
|
|
|
|
2012-03-05 11:49:31 +00:00
|
|
|
/* copy the FP and status/control registers */
|
|
|
|
err = __copy_to_user(ctx->vregs, fpsimd->vregs, sizeof(fpsimd->vregs));
|
|
|
|
__put_user_error(fpsimd->fpsr, &ctx->fpsr, err);
|
|
|
|
__put_user_error(fpsimd->fpcr, &ctx->fpcr, err);
|
|
|
|
|
|
|
|
/* copy the magic/size information */
|
|
|
|
__put_user_error(FPSIMD_MAGIC, &ctx->head.magic, err);
|
|
|
|
__put_user_error(sizeof(struct fpsimd_context), &ctx->head.size, err);
|
|
|
|
|
|
|
|
return err ? -EFAULT : 0;
|
|
|
|
}
|
|
|
|
|
2025-05-08 14:26:24 +01:00
|
|
|
static int read_fpsimd_context(struct user_fpsimd_state *fpsimd,
|
|
|
|
struct user_ctxs *user)
|
2012-03-05 11:49:31 +00:00
|
|
|
{
|
2025-05-08 14:26:24 +01:00
|
|
|
int err;
|
2012-03-05 11:49:31 +00:00
|
|
|
|
2023-01-31 22:20:39 +00:00
|
|
|
/* check the size information */
|
2023-01-31 22:20:42 +00:00
|
|
|
if (user->fpsimd_size != sizeof(struct fpsimd_context))
|
2012-03-05 11:49:31 +00:00
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
/* copy the FP and status/control registers */
|
2025-05-08 14:26:24 +01:00
|
|
|
err = __copy_from_user(fpsimd->vregs, &(user->fpsimd->vregs),
|
|
|
|
sizeof(fpsimd->vregs));
|
|
|
|
__get_user_error(fpsimd->fpsr, &(user->fpsimd->fpsr), err);
|
|
|
|
__get_user_error(fpsimd->fpcr, &(user->fpsimd->fpcr), err);
|
|
|
|
|
|
|
|
return err ? -EFAULT : 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int restore_fpsimd_context(struct user_ctxs *user)
|
|
|
|
{
|
|
|
|
struct user_fpsimd_state fpsimd;
|
|
|
|
int err;
|
|
|
|
|
|
|
|
err = read_fpsimd_context(&fpsimd, user);
|
|
|
|
if (err)
|
|
|
|
return err;
|
2012-03-05 11:49:31 +00:00
|
|
|
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
clear_thread_flag(TIF_SVE);
|
arm64/fpsimd: signal: Clear PSTATE.SM when restoring FPSIMD frame only
On systems with SVE and/or SME, the kernel will always create SVE and
FPSIMD signal frames when delivering a signal, but a user can manipulate
signal frames such that a signal return only observes an FPSIMD signal
frame. When this happens, restore_fpsimd_context() will restore state
such that fp_type==FP_STATE_FPSIMD, but will leave PSTATE.SM as-is.
It is possible for a user to set PSTATE.SM between syscall entry and
execution of the sigreturn logic (e.g. via ptrace), and consequently the
sigreturn may result in the task having PSTATE.SM==1 and
fp_type==FP_STATE_FPSIMD.
For various reasons it is not legitimate for a task to be in a state
where PSTATE.SM==1 and fp_type==FP_STATE_FPSIMD. Portions of the user
ABI are written with the requirement that streaming SVE state is always
presented in SVE format rather than FPSIMD format, and as there is no
mechanism to permit access to only the FPSIMD subset of streaming SVE
state, streaming SVE state must always be saved and restored in SVE
format.
Fix restore_fpsimd_context() to clear PSTATE.SM when restoring an FPSIMD
signal frame without an SVE signal frame. This matches the current
behaviour when an SVE signal frame is present, but the SVE signal frame
has no register payload (e.g. as is the case on SME-only systems which
lack SVE).
This change should have no effect for applications which do not alter
signal frames (i.e. almost all applications). I do not expect
non-{malicious,buggy} applications to hide the SVE signal frame, but
I've chosen to clear PSTATE.SM rather than mandating the presence of an
SVE signal frame in case there is some legacy (non-SME) usage that I am
not currently aware of.
For context, the SME handling was originally introduced in commit:
85ed24dad290 ("arm64/sme: Implement streaming SVE signal handling")
... and subsequently updated/fixed to handle SME-only systems in commits:
7dde62f0687c ("arm64/signal: Always accept SVE signal frames on SME only systems")
f26cd7372160 ("arm64/signal: Always allocate SVE signal frames on SME only systems")
Fixes: 85ed24dad290 ("arm64/sme: Implement streaming SVE signal handling")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250508132644.1395904-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08 14:26:22 +01:00
|
|
|
current->thread.svcr &= ~SVCR_SM_MASK;
|
2022-11-15 09:46:34 +00:00
|
|
|
current->thread.fp_type = FP_STATE_FPSIMD;
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
2012-03-05 11:49:31 +00:00
|
|
|
/* load the hardware registers from the fpsimd_state structure */
|
2025-05-08 14:26:24 +01:00
|
|
|
fpsimd_update_current_state(&fpsimd);
|
|
|
|
return 0;
|
2012-03-05 11:49:31 +00:00
|
|
|
}
|
|
|
|
|
2024-03-06 23:14:49 +00:00
|
|
|
static int preserve_fpmr_context(struct fpmr_context __user *ctx)
|
|
|
|
{
|
|
|
|
int err = 0;
|
|
|
|
|
|
|
|
__put_user_error(FPMR_MAGIC, &ctx->head.magic, err);
|
|
|
|
__put_user_error(sizeof(*ctx), &ctx->head.size, err);
|
|
|
|
__put_user_error(current->thread.uw.fpmr, &ctx->fpmr, err);
|
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int restore_fpmr_context(struct user_ctxs *user)
|
|
|
|
{
|
|
|
|
u64 fpmr;
|
|
|
|
int err = 0;
|
|
|
|
|
|
|
|
if (user->fpmr_size != sizeof(*user->fpmr))
|
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
__get_user_error(fpmr, &user->fpmr->fpmr, err);
|
|
|
|
if (!err)
|
arm64/fpsimd: signal: Always save+flush state early
There are several issues with the way the native signal handling code
manipulates FPSIMD/SVE/SME state, described in detail below. These
issues largely result from races with preemption and inconsistent
handling of live state vs saved state.
Known issues with native FPSIMD/SVE/SME state management include:
* On systems with FPMR, the code to save/restore the FPMR accesses the
register while it is not owned by the current task. Consequently, this
may corrupt the FPMR of the current task and/or may corrupt the FPMR
of an unrelated task. The FPMR save/restore has been broken since it
was introduced in commit:
8c46def44409fc91 ("arm64/signal: Add FPMR signal handling")
* On systems with SME, setup_return() modifies both the live register
state and the saved state register state regardless of whether the
task's state is live, and without holding the cpu fpsimd context.
Consequently:
- This may corrupt the state an unrelated task which has PSTATE.SM set
and/or PSTATE.ZA set.
- The task may enter the signal handler in streaming mode, and or with
ZA storage enabled unexpectedly.
- The task may enter the signal handler in non-streaming SVE mode with
stale SVE register state, which may have been inherited from
streaming SVE mode unexpectedly. Where the streaming and
non-streaming vector lengths differ, this may be packed into
registers arbitrarily.
This logic has been broken since it was introduced in commit:
40a8e87bb32855b3 ("arm64/sme: Disable ZA and streaming mode when handling signals")
Further incorrect manipulation of state was added in commits:
ea64baacbc36a0d5 ("arm64/signal: Flush FPSIMD register state when disabling streaming mode")
baa8515281b30861 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
* Several restoration functions use fpsimd_flush_task_state() to discard
the live FPSIMD/SVE/SME while the in-memory copy is stale.
When a subset of the FPSIMD/SVE/SME state is restored, the remainder
may be non-deterministically reset to a stale snapshot from some
arbitrary point in the past.
This non-deterministic discarding was introduced in commit:
8cd969d28fd2848d ("arm64/sve: Signal handling support")
As of that commit, when TIF_SVE was initially clear, failure to
restore the SVE signal frame could reset the FPSIMD registers to a
stale snapshot.
The pattern of discarding unsaved state was subsequently copied into
restoration functions for some new state in commits:
39782210eb7e8763 ("arm64/sme: Implement ZA signal handling")
ee072cf708048c0d ("arm64/sme: Implement signal handling for ZT")
* On systems with SME/SME2, the entire FPSIMD/SVE/SME state may be
loaded onto the CPU redundantly. Either restore_fpsimd_context() or
restore_sve_fpsimd_context() will load the entire FPSIMD/SVE/SME state
via fpsimd_update_current_state() before restore_za_context() and
restore_zt_context() each discard the state via
fpsimd_flush_task_state().
This is purely redundant work, and not a functional bug.
To fix these issues, rework the native signal handling code to always
save+flush the current task's FPSIMD/SVE/SME state before manipulating
that state. This avoids races with preemption and ensures that state is
manipulated consistently regardless of whether it happened to be live
prior to manipulation. This largely involes:
* Using fpsimd_save_and_flush_current_state() to save+flush the state
for both signal delivery and signal return, before the state is
manipulated in any way.
* Removing fpsimd_signal_preserve_current_state() and updating
preserve_fpsimd_context() to explicitly ensure that the FPSIMD state
is up-to-date, as preserve_fpsimd_context() is the only consumer of
the FPSIMD state during signal delivery.
* Modifying fpsimd_update_current_state() to not reload the FPSIMD state
onto the CPU. Ideally we'd remove fpsimd_update_current_state()
entirely, but I've left that for subsequent patches as there are a
number of of other problems with the FPSIMD<->SVE conversion helpers
that should be addressed at the same time. For now I've removed the
misleading comment.
For setup_return(), we need to decide (for ABI reasons) whether signal
delivery should have all the side-effects of an SMSTOP. For now I've
left a TODO comment, as there are other questions in this area that I'll
address with subsequent patches.
Fixes: 8c46def44409 ("arm64/signal: Add FPMR signal handling")
Fixes: 40a8e87bb328 ("arm64/sme: Disable ZA and streaming mode when handling signals")
Fixes: ea64baacbc36 ("arm64/signal: Flush FPSIMD register state when disabling streaming mode")
Fixes: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support")
Fixes: 39782210eb7e ("arm64/sme: Implement ZA signal handling")
Fixes: ee072cf70804 ("arm64/sme: Implement signal handling for ZT")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250409164010.3480271-13-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09 17:40:09 +01:00
|
|
|
current->thread.uw.fpmr = fpmr;
|
2024-03-06 23:14:49 +00:00
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
static int preserve_poe_context(struct poe_context __user *ctx,
|
|
|
|
const struct user_access_state *ua_state)
|
2024-08-22 16:11:02 +01:00
|
|
|
{
|
|
|
|
int err = 0;
|
|
|
|
|
|
|
|
__put_user_error(POE_MAGIC, &ctx->head.magic, err);
|
|
|
|
__put_user_error(sizeof(*ctx), &ctx->head.size, err);
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
__put_user_error(ua_state->por_el0, &ctx->por_el0, err);
|
2024-08-22 16:11:02 +01:00
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
static int restore_poe_context(struct user_ctxs *user,
|
|
|
|
struct user_access_state *ua_state)
|
2024-08-22 16:11:02 +01:00
|
|
|
{
|
|
|
|
u64 por_el0;
|
|
|
|
int err = 0;
|
|
|
|
|
|
|
|
if (user->poe_size != sizeof(*user->poe))
|
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
__get_user_error(por_el0, &(user->poe->por_el0), err);
|
|
|
|
if (!err)
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
ua_state->por_el0 = por_el0;
|
2024-08-22 16:11:02 +01:00
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
#ifdef CONFIG_ARM64_SVE
|
|
|
|
|
|
|
|
static int preserve_sve_context(struct sve_context __user *ctx)
|
|
|
|
{
|
|
|
|
int err = 0;
|
|
|
|
u16 reserved[ARRAY_SIZE(ctx->__reserved)];
|
2022-04-19 12:22:26 +01:00
|
|
|
u16 flags = 0;
|
2021-10-19 18:22:11 +01:00
|
|
|
unsigned int vl = task_get_sve_vl(current);
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
unsigned int vq = 0;
|
|
|
|
|
2022-04-19 12:22:26 +01:00
|
|
|
if (thread_sm_enabled(¤t->thread)) {
|
|
|
|
vl = task_get_sme_vl(current);
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
vq = sve_vq_from_vl(vl);
|
2022-04-19 12:22:26 +01:00
|
|
|
flags |= SVE_SIG_FLAG_SM;
|
2024-01-30 15:43:53 +00:00
|
|
|
} else if (current->thread.fp_type == FP_STATE_SVE) {
|
2022-04-19 12:22:26 +01:00
|
|
|
vq = sve_vq_from_vl(vl);
|
|
|
|
}
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
|
|
|
memset(reserved, 0, sizeof(reserved));
|
|
|
|
|
|
|
|
__put_user_error(SVE_MAGIC, &ctx->head.magic, err);
|
|
|
|
__put_user_error(round_up(SVE_SIG_CONTEXT_SIZE(vq), 16),
|
|
|
|
&ctx->head.size, err);
|
|
|
|
__put_user_error(vl, &ctx->vl, err);
|
2022-04-19 12:22:26 +01:00
|
|
|
__put_user_error(flags, &ctx->flags, err);
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
BUILD_BUG_ON(sizeof(ctx->__reserved) != sizeof(reserved));
|
|
|
|
err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved));
|
|
|
|
|
|
|
|
if (vq) {
|
|
|
|
err |= __copy_to_user((char __user *)ctx + SVE_SIG_REGS_OFFSET,
|
|
|
|
current->thread.sve_state,
|
|
|
|
SVE_SIG_REGS_SIZE(vq));
|
|
|
|
}
|
|
|
|
|
|
|
|
return err ? -EFAULT : 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int restore_sve_fpsimd_context(struct user_ctxs *user)
|
|
|
|
{
|
2023-01-31 22:20:43 +00:00
|
|
|
int err = 0;
|
2022-04-19 12:22:26 +01:00
|
|
|
unsigned int vl, vq;
|
arm64: fpsimd: Fix state leakage when migrating after sigreturn
When refactoring the sigreturn code to handle SVE, I changed the
sigreturn implementation to store the new FPSIMD state from the
user sigframe into task_struct before reloading the state into the
CPU regs. This makes it easier to convert the data for SVE when
needed.
However, it turns out that the fpsimd_state structure passed into
fpsimd_update_current_state is not fully initialised, so assigning
the structure as a whole corrupts current->thread.fpsimd_state.cpu
with uninitialised data.
This means that if the garbage data written to .cpu happens to be a
valid cpu number, and the task is subsequently migrated to the cpu
identified by the that number, and then tries to enter userspace,
the CPU FPSIMD regs will be assumed to be correct for the task and
not reloaded as they should be. This can result in returning to
userspace with the FPSIMD registers containing data that is stale or
that belongs to another task or to the kernel.
Knowingly handing around a kernel structure that is incompletely
initialised with user data is a potential source of mistakes,
especially across source file boundaries. To help avoid a repeat
of this issue, this patch adapts the relevant internal API to hand
around the user-accessible subset only: struct user_fpsimd_state.
To avoid future surprises, this patch also converts all uses of
struct fpsimd_state that really only access the user subset, to use
struct user_fpsimd_state. A few missing consts are added to
function prototypes for good measure.
Thanks to Will for spotting the cause of the bug here.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-12-15 18:34:38 +00:00
|
|
|
struct user_fpsimd_state fpsimd;
|
2023-01-31 22:20:43 +00:00
|
|
|
u16 user_vl, flags;
|
2025-05-08 14:26:23 +01:00
|
|
|
bool sm;
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
2023-01-31 22:20:42 +00:00
|
|
|
if (user->sve_size < sizeof(*user->sve))
|
|
|
|
return -EINVAL;
|
|
|
|
|
2023-01-31 22:20:43 +00:00
|
|
|
__get_user_error(user_vl, &(user->sve->vl), err);
|
|
|
|
__get_user_error(flags, &(user->sve->flags), err);
|
|
|
|
if (err)
|
|
|
|
return err;
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
2025-05-08 14:26:23 +01:00
|
|
|
sm = flags & SVE_SIG_FLAG_SM;
|
|
|
|
if (sm) {
|
2022-04-19 12:22:26 +01:00
|
|
|
if (!system_supports_sme())
|
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
vl = task_get_sme_vl(current);
|
|
|
|
} else {
|
2022-12-27 17:12:05 +00:00
|
|
|
/*
|
|
|
|
* A SME only system use SVE for streaming mode so can
|
|
|
|
* have a SVE formatted context with a zero VL and no
|
|
|
|
* payload data.
|
|
|
|
*/
|
|
|
|
if (!system_supports_sve() && !system_supports_sme())
|
2022-06-24 18:21:08 +01:00
|
|
|
return -EINVAL;
|
|
|
|
|
2022-04-19 12:22:26 +01:00
|
|
|
vl = task_get_sve_vl(current);
|
|
|
|
}
|
|
|
|
|
2023-01-31 22:20:43 +00:00
|
|
|
if (user_vl != vl)
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
return -EINVAL;
|
|
|
|
|
2025-05-08 14:26:23 +01:00
|
|
|
/*
|
|
|
|
* Non-streaming SVE state may be preserved without an SVE payload, in
|
|
|
|
* which case the SVE context only has a header with VL==0, and all
|
|
|
|
* state can be restored from the FPSIMD context.
|
|
|
|
*
|
|
|
|
* Streaming SVE state is always preserved with an SVE payload. For
|
|
|
|
* consistency and robustness, reject restoring streaming SVE state
|
|
|
|
* without an SVE payload.
|
|
|
|
*/
|
2025-05-08 14:26:24 +01:00
|
|
|
if (!sm && user->sve_size == sizeof(*user->sve))
|
|
|
|
return restore_fpsimd_context(user);
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
2023-01-31 22:20:43 +00:00
|
|
|
vq = sve_vq_from_vl(vl);
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
2023-01-31 22:20:42 +00:00
|
|
|
if (user->sve_size < SVE_SIG_CONTEXT_SIZE(vq))
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
return -EINVAL;
|
|
|
|
|
2022-08-17 19:23:23 +01:00
|
|
|
sve_alloc(current, true);
|
2021-08-24 16:34:17 +01:00
|
|
|
if (!current->thread.sve_state) {
|
|
|
|
clear_thread_flag(TIF_SVE);
|
|
|
|
return -ENOMEM;
|
|
|
|
}
|
|
|
|
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
err = __copy_from_user(current->thread.sve_state,
|
|
|
|
(char __user const *)user->sve +
|
|
|
|
SVE_SIG_REGS_OFFSET,
|
|
|
|
SVE_SIG_REGS_SIZE(vq));
|
|
|
|
if (err)
|
|
|
|
return -EFAULT;
|
|
|
|
|
2023-01-31 22:20:43 +00:00
|
|
|
if (flags & SVE_SIG_FLAG_SM)
|
2022-05-10 17:12:01 +01:00
|
|
|
current->thread.svcr |= SVCR_SM_MASK;
|
2022-04-19 12:22:26 +01:00
|
|
|
else
|
|
|
|
set_thread_flag(TIF_SVE);
|
2022-11-15 09:46:34 +00:00
|
|
|
current->thread.fp_type = FP_STATE_SVE;
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
2025-05-08 14:26:24 +01:00
|
|
|
err = read_fpsimd_context(&fpsimd, user);
|
|
|
|
if (err)
|
|
|
|
return err;
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
2025-05-08 14:26:24 +01:00
|
|
|
/* Merge the FPSIMD registers into the SVE state */
|
|
|
|
fpsimd_update_current_state(&fpsimd);
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
2025-05-08 14:26:24 +01:00
|
|
|
return 0;
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
#else /* ! CONFIG_ARM64_SVE */
|
|
|
|
|
2022-06-24 18:21:08 +01:00
|
|
|
static int restore_sve_fpsimd_context(struct user_ctxs *user)
|
|
|
|
{
|
|
|
|
WARN_ON_ONCE(1);
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Turn any non-optimised out attempts to use this into a link error: */
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
extern int preserve_sve_context(void __user *ctx);
|
|
|
|
|
|
|
|
#endif /* ! CONFIG_ARM64_SVE */
|
|
|
|
|
2022-04-19 12:22:27 +01:00
|
|
|
#ifdef CONFIG_ARM64_SME
|
|
|
|
|
2022-12-27 14:20:41 +00:00
|
|
|
static int preserve_tpidr2_context(struct tpidr2_context __user *ctx)
|
|
|
|
{
|
2025-04-09 17:40:10 +01:00
|
|
|
u64 tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
|
2022-12-27 14:20:41 +00:00
|
|
|
int err = 0;
|
|
|
|
|
|
|
|
__put_user_error(TPIDR2_MAGIC, &ctx->head.magic, err);
|
|
|
|
__put_user_error(sizeof(*ctx), &ctx->head.size, err);
|
2025-04-09 17:40:10 +01:00
|
|
|
__put_user_error(tpidr2_el0, &ctx->tpidr2, err);
|
2022-12-27 14:20:41 +00:00
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int restore_tpidr2_context(struct user_ctxs *user)
|
|
|
|
{
|
|
|
|
u64 tpidr2_el0;
|
|
|
|
int err = 0;
|
|
|
|
|
2023-01-31 22:20:42 +00:00
|
|
|
if (user->tpidr2_size != sizeof(*user->tpidr2))
|
|
|
|
return -EINVAL;
|
|
|
|
|
2022-12-27 14:20:41 +00:00
|
|
|
__get_user_error(tpidr2_el0, &user->tpidr2->tpidr2, err);
|
|
|
|
if (!err)
|
2023-06-22 14:39:45 +01:00
|
|
|
write_sysreg_s(tpidr2_el0, SYS_TPIDR2_EL0);
|
2022-12-27 14:20:41 +00:00
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2022-04-19 12:22:27 +01:00
|
|
|
static int preserve_za_context(struct za_context __user *ctx)
|
|
|
|
{
|
|
|
|
int err = 0;
|
|
|
|
u16 reserved[ARRAY_SIZE(ctx->__reserved)];
|
|
|
|
unsigned int vl = task_get_sme_vl(current);
|
|
|
|
unsigned int vq;
|
|
|
|
|
|
|
|
if (thread_za_enabled(¤t->thread))
|
|
|
|
vq = sve_vq_from_vl(vl);
|
|
|
|
else
|
|
|
|
vq = 0;
|
|
|
|
|
|
|
|
memset(reserved, 0, sizeof(reserved));
|
|
|
|
|
|
|
|
__put_user_error(ZA_MAGIC, &ctx->head.magic, err);
|
|
|
|
__put_user_error(round_up(ZA_SIG_CONTEXT_SIZE(vq), 16),
|
|
|
|
&ctx->head.size, err);
|
|
|
|
__put_user_error(vl, &ctx->vl, err);
|
|
|
|
BUILD_BUG_ON(sizeof(ctx->__reserved) != sizeof(reserved));
|
|
|
|
err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved));
|
|
|
|
|
|
|
|
if (vq) {
|
|
|
|
err |= __copy_to_user((char __user *)ctx + ZA_SIG_REGS_OFFSET,
|
2023-01-16 16:04:36 +00:00
|
|
|
current->thread.sme_state,
|
2022-04-19 12:22:27 +01:00
|
|
|
ZA_SIG_REGS_SIZE(vq));
|
|
|
|
}
|
|
|
|
|
|
|
|
return err ? -EFAULT : 0;
|
|
|
|
}
|
|
|
|
|
2022-06-01 18:13:38 +01:00
|
|
|
static int restore_za_context(struct user_ctxs *user)
|
2022-04-19 12:22:27 +01:00
|
|
|
{
|
2023-01-31 22:20:44 +00:00
|
|
|
int err = 0;
|
2022-04-19 12:22:27 +01:00
|
|
|
unsigned int vq;
|
2023-01-31 22:20:44 +00:00
|
|
|
u16 user_vl;
|
2022-04-19 12:22:27 +01:00
|
|
|
|
2023-01-31 22:20:42 +00:00
|
|
|
if (user->za_size < sizeof(*user->za))
|
|
|
|
return -EINVAL;
|
2022-04-19 12:22:27 +01:00
|
|
|
|
2023-01-31 22:20:44 +00:00
|
|
|
__get_user_error(user_vl, &(user->za->vl), err);
|
|
|
|
if (err)
|
|
|
|
return err;
|
2022-04-19 12:22:27 +01:00
|
|
|
|
2023-01-31 22:20:44 +00:00
|
|
|
if (user_vl != task_get_sme_vl(current))
|
2022-04-19 12:22:27 +01:00
|
|
|
return -EINVAL;
|
|
|
|
|
2023-01-31 22:20:42 +00:00
|
|
|
if (user->za_size == sizeof(*user->za)) {
|
2022-05-10 17:12:01 +01:00
|
|
|
current->thread.svcr &= ~SVCR_ZA_MASK;
|
2022-04-19 12:22:27 +01:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2023-01-31 22:20:44 +00:00
|
|
|
vq = sve_vq_from_vl(user_vl);
|
2022-04-19 12:22:27 +01:00
|
|
|
|
2023-01-31 22:20:42 +00:00
|
|
|
if (user->za_size < ZA_SIG_CONTEXT_SIZE(vq))
|
2022-04-19 12:22:27 +01:00
|
|
|
return -EINVAL;
|
|
|
|
|
2023-08-10 12:28:19 +01:00
|
|
|
sme_alloc(current, true);
|
2023-01-16 16:04:36 +00:00
|
|
|
if (!current->thread.sme_state) {
|
2022-05-10 17:12:01 +01:00
|
|
|
current->thread.svcr &= ~SVCR_ZA_MASK;
|
2022-04-19 12:22:27 +01:00
|
|
|
clear_thread_flag(TIF_SME);
|
|
|
|
return -ENOMEM;
|
|
|
|
}
|
|
|
|
|
2023-01-16 16:04:36 +00:00
|
|
|
err = __copy_from_user(current->thread.sme_state,
|
2022-04-19 12:22:27 +01:00
|
|
|
(char __user const *)user->za +
|
|
|
|
ZA_SIG_REGS_OFFSET,
|
|
|
|
ZA_SIG_REGS_SIZE(vq));
|
|
|
|
if (err)
|
|
|
|
return -EFAULT;
|
|
|
|
|
|
|
|
set_thread_flag(TIF_SME);
|
2022-05-10 17:12:01 +01:00
|
|
|
current->thread.svcr |= SVCR_ZA_MASK;
|
2022-04-19 12:22:27 +01:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
2023-01-16 16:04:46 +00:00
|
|
|
|
|
|
|
static int preserve_zt_context(struct zt_context __user *ctx)
|
|
|
|
{
|
|
|
|
int err = 0;
|
|
|
|
u16 reserved[ARRAY_SIZE(ctx->__reserved)];
|
|
|
|
|
|
|
|
if (WARN_ON(!thread_za_enabled(¤t->thread)))
|
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
memset(reserved, 0, sizeof(reserved));
|
|
|
|
|
|
|
|
__put_user_error(ZT_MAGIC, &ctx->head.magic, err);
|
|
|
|
__put_user_error(round_up(ZT_SIG_CONTEXT_SIZE(1), 16),
|
|
|
|
&ctx->head.size, err);
|
|
|
|
__put_user_error(1, &ctx->nregs, err);
|
|
|
|
BUILD_BUG_ON(sizeof(ctx->__reserved) != sizeof(reserved));
|
|
|
|
err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved));
|
|
|
|
|
|
|
|
err |= __copy_to_user((char __user *)ctx + ZT_SIG_REGS_OFFSET,
|
|
|
|
thread_zt_state(¤t->thread),
|
|
|
|
ZT_SIG_REGS_SIZE(1));
|
|
|
|
|
|
|
|
return err ? -EFAULT : 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int restore_zt_context(struct user_ctxs *user)
|
|
|
|
{
|
|
|
|
int err;
|
2023-01-31 22:20:45 +00:00
|
|
|
u16 nregs;
|
2023-01-16 16:04:46 +00:00
|
|
|
|
|
|
|
/* ZA must be restored first for this check to be valid */
|
|
|
|
if (!thread_za_enabled(¤t->thread))
|
|
|
|
return -EINVAL;
|
|
|
|
|
2023-01-31 22:20:42 +00:00
|
|
|
if (user->zt_size != ZT_SIG_CONTEXT_SIZE(1))
|
|
|
|
return -EINVAL;
|
|
|
|
|
2023-01-31 22:20:45 +00:00
|
|
|
if (__copy_from_user(&nregs, &(user->zt->nregs), sizeof(nregs)))
|
2023-01-16 16:04:46 +00:00
|
|
|
return -EFAULT;
|
|
|
|
|
2023-01-31 22:20:45 +00:00
|
|
|
if (nregs != 1)
|
2023-01-16 16:04:46 +00:00
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
err = __copy_from_user(thread_zt_state(¤t->thread),
|
|
|
|
(char __user const *)user->zt +
|
|
|
|
ZT_SIG_REGS_OFFSET,
|
|
|
|
ZT_SIG_REGS_SIZE(1));
|
|
|
|
if (err)
|
|
|
|
return -EFAULT;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2022-04-19 12:22:27 +01:00
|
|
|
#else /* ! CONFIG_ARM64_SME */
|
|
|
|
|
|
|
|
/* Turn any non-optimised out attempts to use these into a link error: */
|
2022-12-27 14:20:41 +00:00
|
|
|
extern int preserve_tpidr2_context(void __user *ctx);
|
|
|
|
extern int restore_tpidr2_context(struct user_ctxs *user);
|
2022-04-19 12:22:27 +01:00
|
|
|
extern int preserve_za_context(void __user *ctx);
|
|
|
|
extern int restore_za_context(struct user_ctxs *user);
|
2023-01-16 16:04:46 +00:00
|
|
|
extern int preserve_zt_context(void __user *ctx);
|
|
|
|
extern int restore_zt_context(struct user_ctxs *user);
|
2022-04-19 12:22:27 +01:00
|
|
|
|
|
|
|
#endif /* ! CONFIG_ARM64_SME */
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
2024-10-01 23:59:05 +01:00
|
|
|
#ifdef CONFIG_ARM64_GCS
|
|
|
|
|
|
|
|
static int preserve_gcs_context(struct gcs_context __user *ctx)
|
|
|
|
{
|
|
|
|
int err = 0;
|
|
|
|
u64 gcspr = read_sysreg_s(SYS_GCSPR_EL0);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If GCS is enabled we will add a cap token to the frame,
|
|
|
|
* include it in the GCSPR_EL0 we report to support stack
|
|
|
|
* switching via sigreturn if GCS is enabled. We do not allow
|
|
|
|
* enabling via sigreturn so the token is only relevant for
|
|
|
|
* threads with GCS enabled.
|
|
|
|
*/
|
|
|
|
if (task_gcs_el0_enabled(current))
|
|
|
|
gcspr -= 8;
|
|
|
|
|
|
|
|
__put_user_error(GCS_MAGIC, &ctx->head.magic, err);
|
|
|
|
__put_user_error(sizeof(*ctx), &ctx->head.size, err);
|
|
|
|
__put_user_error(gcspr, &ctx->gcspr, err);
|
|
|
|
__put_user_error(0, &ctx->reserved, err);
|
|
|
|
__put_user_error(current->thread.gcs_el0_mode,
|
|
|
|
&ctx->features_enabled, err);
|
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int restore_gcs_context(struct user_ctxs *user)
|
|
|
|
{
|
|
|
|
u64 gcspr, enabled;
|
|
|
|
int err = 0;
|
|
|
|
|
|
|
|
if (user->gcs_size != sizeof(*user->gcs))
|
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
__get_user_error(gcspr, &user->gcs->gcspr, err);
|
|
|
|
__get_user_error(enabled, &user->gcs->features_enabled, err);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
/* Don't allow unknown modes */
|
|
|
|
if (enabled & ~PR_SHADOW_STACK_SUPPORTED_STATUS_MASK)
|
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
err = gcs_check_locked(current, enabled);
|
|
|
|
if (err != 0)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
/* Don't allow enabling */
|
|
|
|
if (!task_gcs_el0_enabled(current) &&
|
|
|
|
(enabled & PR_SHADOW_STACK_ENABLE))
|
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
/* If we are disabling disable everything */
|
|
|
|
if (!(enabled & PR_SHADOW_STACK_ENABLE))
|
|
|
|
enabled = 0;
|
|
|
|
|
|
|
|
current->thread.gcs_el0_mode = enabled;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We let userspace set GCSPR_EL0 to anything here, we will
|
|
|
|
* validate later in gcs_restore_signal().
|
|
|
|
*/
|
|
|
|
write_sysreg_s(gcspr, SYS_GCSPR_EL0);
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
#else /* ! CONFIG_ARM64_GCS */
|
|
|
|
|
|
|
|
/* Turn any non-optimised out attempts to use these into a link error: */
|
|
|
|
extern int preserve_gcs_context(void __user *ctx);
|
|
|
|
extern int restore_gcs_context(struct user_ctxs *user);
|
|
|
|
|
|
|
|
#endif /* ! CONFIG_ARM64_GCS */
|
|
|
|
|
2017-06-15 15:03:39 +01:00
|
|
|
static int parse_user_sigframe(struct user_ctxs *user,
|
|
|
|
struct rt_sigframe __user *sf)
|
|
|
|
{
|
|
|
|
struct sigcontext __user *const sc = &sf->uc.uc_mcontext;
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
struct _aarch64_ctx __user *head;
|
|
|
|
char __user *base = (char __user *)&sc->__reserved;
|
2017-06-15 15:03:39 +01:00
|
|
|
size_t offset = 0;
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
size_t limit = sizeof(sc->__reserved);
|
2017-06-20 18:23:39 +01:00
|
|
|
bool have_extra_context = false;
|
|
|
|
char const __user *const sfp = (char const __user *)sf;
|
2017-06-15 15:03:39 +01:00
|
|
|
|
|
|
|
user->fpsimd = NULL;
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
user->sve = NULL;
|
2022-12-27 14:20:41 +00:00
|
|
|
user->tpidr2 = NULL;
|
2022-04-19 12:22:27 +01:00
|
|
|
user->za = NULL;
|
2023-01-16 16:04:46 +00:00
|
|
|
user->zt = NULL;
|
2024-03-06 23:14:49 +00:00
|
|
|
user->fpmr = NULL;
|
2024-08-22 16:11:02 +01:00
|
|
|
user->poe = NULL;
|
2024-10-01 23:59:05 +01:00
|
|
|
user->gcs = NULL;
|
2017-06-15 15:03:39 +01:00
|
|
|
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
if (!IS_ALIGNED((unsigned long)base, 16))
|
|
|
|
goto invalid;
|
|
|
|
|
2017-06-15 15:03:39 +01:00
|
|
|
while (1) {
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
int err = 0;
|
2017-06-15 15:03:39 +01:00
|
|
|
u32 magic, size;
|
2017-06-20 18:23:39 +01:00
|
|
|
char const __user *userp;
|
|
|
|
struct extra_context const __user *extra;
|
|
|
|
u64 extra_datap;
|
|
|
|
u32 extra_size;
|
|
|
|
struct _aarch64_ctx const __user *end;
|
|
|
|
u32 end_magic, end_size;
|
2017-06-15 15:03:39 +01:00
|
|
|
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
if (limit - offset < sizeof(*head))
|
2017-06-15 15:03:39 +01:00
|
|
|
goto invalid;
|
|
|
|
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
if (!IS_ALIGNED(offset, 16))
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
head = (struct _aarch64_ctx __user *)(base + offset);
|
2017-06-15 15:03:39 +01:00
|
|
|
__get_user_error(magic, &head->magic, err);
|
|
|
|
__get_user_error(size, &head->size, err);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
if (limit - offset < size)
|
|
|
|
goto invalid;
|
|
|
|
|
2017-06-15 15:03:39 +01:00
|
|
|
switch (magic) {
|
|
|
|
case 0:
|
|
|
|
if (size)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
goto done;
|
|
|
|
|
|
|
|
case FPSIMD_MAGIC:
|
2020-01-13 23:30:22 +00:00
|
|
|
if (!system_supports_fpsimd())
|
|
|
|
goto invalid;
|
2017-06-15 15:03:39 +01:00
|
|
|
if (user->fpsimd)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
user->fpsimd = (struct fpsimd_context __user *)head;
|
2023-01-31 22:20:42 +00:00
|
|
|
user->fpsimd_size = size;
|
2017-06-15 15:03:39 +01:00
|
|
|
break;
|
|
|
|
|
|
|
|
case ESR_MAGIC:
|
|
|
|
/* ignore */
|
|
|
|
break;
|
|
|
|
|
2024-08-22 16:11:02 +01:00
|
|
|
case POE_MAGIC:
|
|
|
|
if (!system_supports_poe())
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
if (user->poe)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
user->poe = (struct poe_context __user *)head;
|
|
|
|
user->poe_size = size;
|
|
|
|
break;
|
|
|
|
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
case SVE_MAGIC:
|
2022-04-19 12:22:26 +01:00
|
|
|
if (!system_supports_sve() && !system_supports_sme())
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
if (user->sve)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
user->sve = (struct sve_context __user *)head;
|
2023-01-31 22:20:42 +00:00
|
|
|
user->sve_size = size;
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
break;
|
|
|
|
|
2022-12-27 14:20:41 +00:00
|
|
|
case TPIDR2_MAGIC:
|
2023-03-17 20:49:12 +08:00
|
|
|
if (!system_supports_tpidr2())
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
goto invalid;
|
|
|
|
|
2022-12-27 14:20:41 +00:00
|
|
|
if (user->tpidr2)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
user->tpidr2 = (struct tpidr2_context __user *)head;
|
2023-01-31 22:20:42 +00:00
|
|
|
user->tpidr2_size = size;
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
break;
|
|
|
|
|
2022-04-19 12:22:27 +01:00
|
|
|
case ZA_MAGIC:
|
|
|
|
if (!system_supports_sme())
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
if (user->za)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
user->za = (struct za_context __user *)head;
|
2023-01-31 22:20:42 +00:00
|
|
|
user->za_size = size;
|
2022-04-19 12:22:27 +01:00
|
|
|
break;
|
|
|
|
|
2023-01-16 16:04:46 +00:00
|
|
|
case ZT_MAGIC:
|
|
|
|
if (!system_supports_sme2())
|
2022-04-19 12:22:27 +01:00
|
|
|
goto invalid;
|
|
|
|
|
2023-01-16 16:04:46 +00:00
|
|
|
if (user->zt)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
user->zt = (struct zt_context __user *)head;
|
2023-01-31 22:20:42 +00:00
|
|
|
user->zt_size = size;
|
2022-04-19 12:22:27 +01:00
|
|
|
break;
|
|
|
|
|
2024-03-06 23:14:49 +00:00
|
|
|
case FPMR_MAGIC:
|
|
|
|
if (!system_supports_fpmr())
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
if (user->fpmr)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
user->fpmr = (struct fpmr_context __user *)head;
|
|
|
|
user->fpmr_size = size;
|
|
|
|
break;
|
|
|
|
|
2024-10-01 23:59:05 +01:00
|
|
|
case GCS_MAGIC:
|
|
|
|
if (!system_supports_gcs())
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
if (user->gcs)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
user->gcs = (struct gcs_context __user *)head;
|
|
|
|
user->gcs_size = size;
|
|
|
|
break;
|
|
|
|
|
2017-06-20 18:23:39 +01:00
|
|
|
case EXTRA_MAGIC:
|
|
|
|
if (have_extra_context)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
if (size < sizeof(*extra))
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
userp = (char const __user *)head;
|
|
|
|
|
|
|
|
extra = (struct extra_context const __user *)userp;
|
|
|
|
userp += size;
|
|
|
|
|
|
|
|
__get_user_error(extra_datap, &extra->datap, err);
|
|
|
|
__get_user_error(extra_size, &extra->size, err);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
/* Check for the dummy terminator in __reserved[]: */
|
|
|
|
|
|
|
|
if (limit - offset - size < TERMINATOR_SIZE)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
end = (struct _aarch64_ctx const __user *)userp;
|
|
|
|
userp += TERMINATOR_SIZE;
|
|
|
|
|
|
|
|
__get_user_error(end_magic, &end->magic, err);
|
|
|
|
__get_user_error(end_size, &end->size, err);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
if (end_magic || end_size)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
/* Prevent looping/repeated parsing of extra_context */
|
|
|
|
have_extra_context = true;
|
|
|
|
|
|
|
|
base = (__force void __user *)extra_datap;
|
|
|
|
if (!IS_ALIGNED((unsigned long)base, 16))
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
if (!IS_ALIGNED(extra_size, 16))
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
if (base != userp)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
/* Reject "unreasonably large" frames: */
|
|
|
|
if (extra_size > sfp + SIGFRAME_MAXSZ - userp)
|
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Ignore trailing terminator in __reserved[]
|
|
|
|
* and start parsing extra data:
|
|
|
|
*/
|
|
|
|
offset = 0;
|
|
|
|
limit = extra_size;
|
2017-10-31 15:50:55 +00:00
|
|
|
|
Remove 'type' argument from access_ok() function
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.
It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access. But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.
A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model. And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.
This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
There were a couple of notable cases:
- csky still had the old "verify_area()" name as an alias.
- the iter_iov code had magical hardcoded knowledge of the actual
values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
really used it)
- microblaze used the type argument for a debug printout
but other than those oddities this should be a total no-op patch.
I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something. Any missed conversion should be trivially fixable, though.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-03 18:57:57 -08:00
|
|
|
if (!access_ok(base, limit))
|
2017-10-31 15:50:55 +00:00
|
|
|
goto invalid;
|
|
|
|
|
2017-06-20 18:23:39 +01:00
|
|
|
continue;
|
|
|
|
|
2017-06-15 15:03:39 +01:00
|
|
|
default:
|
|
|
|
goto invalid;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (size < sizeof(*head))
|
|
|
|
goto invalid;
|
|
|
|
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
if (limit - offset < size)
|
2017-06-15 15:03:39 +01:00
|
|
|
goto invalid;
|
|
|
|
|
|
|
|
offset += size;
|
|
|
|
}
|
|
|
|
|
|
|
|
done:
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
invalid:
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
2012-03-05 11:49:31 +00:00
|
|
|
static int restore_sigframe(struct pt_regs *regs,
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
struct rt_sigframe __user *sf,
|
|
|
|
struct user_access_state *ua_state)
|
2012-03-05 11:49:31 +00:00
|
|
|
{
|
|
|
|
sigset_t set;
|
|
|
|
int i, err;
|
2017-06-15 15:03:39 +01:00
|
|
|
struct user_ctxs user;
|
2012-03-05 11:49:31 +00:00
|
|
|
|
|
|
|
err = __copy_from_user(&set, &sf->uc.uc_sigmask, sizeof(set));
|
|
|
|
if (err == 0)
|
|
|
|
set_current_blocked(&set);
|
|
|
|
|
|
|
|
for (i = 0; i < 31; i++)
|
|
|
|
__get_user_error(regs->regs[i], &sf->uc.uc_mcontext.regs[i],
|
|
|
|
err);
|
|
|
|
__get_user_error(regs->sp, &sf->uc.uc_mcontext.sp, err);
|
|
|
|
__get_user_error(regs->pc, &sf->uc.uc_mcontext.pc, err);
|
|
|
|
__get_user_error(regs->pstate, &sf->uc.uc_mcontext.pstate, err);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Avoid sys_rt_sigreturn() restarting.
|
|
|
|
*/
|
2017-08-01 15:35:54 +01:00
|
|
|
forget_syscall(regs);
|
2012-03-05 11:49:31 +00:00
|
|
|
|
arm64/fpsimd: signal: Always save+flush state early
There are several issues with the way the native signal handling code
manipulates FPSIMD/SVE/SME state, described in detail below. These
issues largely result from races with preemption and inconsistent
handling of live state vs saved state.
Known issues with native FPSIMD/SVE/SME state management include:
* On systems with FPMR, the code to save/restore the FPMR accesses the
register while it is not owned by the current task. Consequently, this
may corrupt the FPMR of the current task and/or may corrupt the FPMR
of an unrelated task. The FPMR save/restore has been broken since it
was introduced in commit:
8c46def44409fc91 ("arm64/signal: Add FPMR signal handling")
* On systems with SME, setup_return() modifies both the live register
state and the saved state register state regardless of whether the
task's state is live, and without holding the cpu fpsimd context.
Consequently:
- This may corrupt the state an unrelated task which has PSTATE.SM set
and/or PSTATE.ZA set.
- The task may enter the signal handler in streaming mode, and or with
ZA storage enabled unexpectedly.
- The task may enter the signal handler in non-streaming SVE mode with
stale SVE register state, which may have been inherited from
streaming SVE mode unexpectedly. Where the streaming and
non-streaming vector lengths differ, this may be packed into
registers arbitrarily.
This logic has been broken since it was introduced in commit:
40a8e87bb32855b3 ("arm64/sme: Disable ZA and streaming mode when handling signals")
Further incorrect manipulation of state was added in commits:
ea64baacbc36a0d5 ("arm64/signal: Flush FPSIMD register state when disabling streaming mode")
baa8515281b30861 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
* Several restoration functions use fpsimd_flush_task_state() to discard
the live FPSIMD/SVE/SME while the in-memory copy is stale.
When a subset of the FPSIMD/SVE/SME state is restored, the remainder
may be non-deterministically reset to a stale snapshot from some
arbitrary point in the past.
This non-deterministic discarding was introduced in commit:
8cd969d28fd2848d ("arm64/sve: Signal handling support")
As of that commit, when TIF_SVE was initially clear, failure to
restore the SVE signal frame could reset the FPSIMD registers to a
stale snapshot.
The pattern of discarding unsaved state was subsequently copied into
restoration functions for some new state in commits:
39782210eb7e8763 ("arm64/sme: Implement ZA signal handling")
ee072cf708048c0d ("arm64/sme: Implement signal handling for ZT")
* On systems with SME/SME2, the entire FPSIMD/SVE/SME state may be
loaded onto the CPU redundantly. Either restore_fpsimd_context() or
restore_sve_fpsimd_context() will load the entire FPSIMD/SVE/SME state
via fpsimd_update_current_state() before restore_za_context() and
restore_zt_context() each discard the state via
fpsimd_flush_task_state().
This is purely redundant work, and not a functional bug.
To fix these issues, rework the native signal handling code to always
save+flush the current task's FPSIMD/SVE/SME state before manipulating
that state. This avoids races with preemption and ensures that state is
manipulated consistently regardless of whether it happened to be live
prior to manipulation. This largely involes:
* Using fpsimd_save_and_flush_current_state() to save+flush the state
for both signal delivery and signal return, before the state is
manipulated in any way.
* Removing fpsimd_signal_preserve_current_state() and updating
preserve_fpsimd_context() to explicitly ensure that the FPSIMD state
is up-to-date, as preserve_fpsimd_context() is the only consumer of
the FPSIMD state during signal delivery.
* Modifying fpsimd_update_current_state() to not reload the FPSIMD state
onto the CPU. Ideally we'd remove fpsimd_update_current_state()
entirely, but I've left that for subsequent patches as there are a
number of of other problems with the FPSIMD<->SVE conversion helpers
that should be addressed at the same time. For now I've removed the
misleading comment.
For setup_return(), we need to decide (for ABI reasons) whether signal
delivery should have all the side-effects of an SMSTOP. For now I've
left a TODO comment, as there are other questions in this area that I'll
address with subsequent patches.
Fixes: 8c46def44409 ("arm64/signal: Add FPMR signal handling")
Fixes: 40a8e87bb328 ("arm64/sme: Disable ZA and streaming mode when handling signals")
Fixes: ea64baacbc36 ("arm64/signal: Flush FPSIMD register state when disabling streaming mode")
Fixes: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support")
Fixes: 39782210eb7e ("arm64/sme: Implement ZA signal handling")
Fixes: ee072cf70804 ("arm64/sme: Implement signal handling for ZT")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250409164010.3480271-13-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09 17:40:09 +01:00
|
|
|
fpsimd_save_and_flush_current_state();
|
|
|
|
|
2016-03-01 14:18:50 +00:00
|
|
|
err |= !valid_user_regs(®s->user_regs, current);
|
2017-06-15 15:03:39 +01:00
|
|
|
if (err == 0)
|
|
|
|
err = parse_user_sigframe(&user, sf);
|
2012-03-05 11:49:31 +00:00
|
|
|
|
2020-01-13 23:30:22 +00:00
|
|
|
if (err == 0 && system_supports_fpsimd()) {
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
if (!user.fpsimd)
|
|
|
|
return -EINVAL;
|
|
|
|
|
2022-06-24 18:21:08 +01:00
|
|
|
if (user.sve)
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
err = restore_sve_fpsimd_context(&user);
|
2022-06-24 18:21:08 +01:00
|
|
|
else
|
2023-01-31 22:20:41 +00:00
|
|
|
err = restore_fpsimd_context(&user);
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
}
|
2012-03-05 11:49:31 +00:00
|
|
|
|
2024-10-01 23:59:05 +01:00
|
|
|
if (err == 0 && system_supports_gcs() && user.gcs)
|
|
|
|
err = restore_gcs_context(&user);
|
|
|
|
|
2023-03-17 20:49:12 +08:00
|
|
|
if (err == 0 && system_supports_tpidr2() && user.tpidr2)
|
2022-12-27 14:20:41 +00:00
|
|
|
err = restore_tpidr2_context(&user);
|
|
|
|
|
2024-03-06 23:14:49 +00:00
|
|
|
if (err == 0 && system_supports_fpmr() && user.fpmr)
|
|
|
|
err = restore_fpmr_context(&user);
|
|
|
|
|
2022-04-19 12:22:27 +01:00
|
|
|
if (err == 0 && system_supports_sme() && user.za)
|
|
|
|
err = restore_za_context(&user);
|
|
|
|
|
2023-01-16 16:04:46 +00:00
|
|
|
if (err == 0 && system_supports_sme2() && user.zt)
|
|
|
|
err = restore_zt_context(&user);
|
|
|
|
|
2024-08-22 16:11:02 +01:00
|
|
|
if (err == 0 && system_supports_poe() && user.poe)
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
err = restore_poe_context(&user, ua_state);
|
2024-08-22 16:11:02 +01:00
|
|
|
|
2012-03-05 11:49:31 +00:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2024-10-01 23:59:04 +01:00
|
|
|
#ifdef CONFIG_ARM64_GCS
|
|
|
|
static int gcs_restore_signal(void)
|
|
|
|
{
|
2024-12-14 02:12:06 +00:00
|
|
|
u64 gcspr_el0, cap;
|
2024-10-01 23:59:04 +01:00
|
|
|
int ret;
|
|
|
|
|
|
|
|
if (!system_supports_gcs())
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
if (!(current->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE))
|
|
|
|
return 0;
|
|
|
|
|
2024-12-14 02:12:06 +00:00
|
|
|
gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0);
|
2024-10-01 23:59:04 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Ensure that any changes to the GCS done via GCS operations
|
|
|
|
* are visible to the normal reads we do to validate the
|
|
|
|
* token.
|
|
|
|
*/
|
|
|
|
gcsb_dsync();
|
|
|
|
|
|
|
|
/*
|
|
|
|
* GCSPR_EL0 should be pointing at a capped GCS, read the cap.
|
|
|
|
* We don't enforce that this is in a GCS page, if it is not
|
|
|
|
* then faults will be generated on GCS operations - the main
|
|
|
|
* concern is to protect GCS pages.
|
|
|
|
*/
|
2024-12-14 02:12:06 +00:00
|
|
|
ret = copy_from_user(&cap, (unsigned long __user *)gcspr_el0,
|
|
|
|
sizeof(cap));
|
2024-10-01 23:59:04 +01:00
|
|
|
if (ret)
|
|
|
|
return -EFAULT;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Check that the cap is the actual GCS before replacing it.
|
|
|
|
*/
|
2024-12-14 02:12:06 +00:00
|
|
|
if (cap != GCS_SIGNAL_CAP(gcspr_el0))
|
2024-10-01 23:59:04 +01:00
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
/* Invalidate the token to prevent reuse */
|
2024-12-14 02:12:06 +00:00
|
|
|
put_user_gcs(0, (unsigned long __user *)gcspr_el0, &ret);
|
2024-10-01 23:59:04 +01:00
|
|
|
if (ret != 0)
|
|
|
|
return -EFAULT;
|
|
|
|
|
2024-12-14 02:12:06 +00:00
|
|
|
write_sysreg_s(gcspr_el0 + 8, SYS_GCSPR_EL0);
|
2024-10-01 23:59:04 +01:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
#else
|
|
|
|
static int gcs_restore_signal(void) { return 0; }
|
|
|
|
#endif
|
|
|
|
|
2018-07-11 14:56:53 +01:00
|
|
|
SYSCALL_DEFINE0(rt_sigreturn)
|
2012-03-05 11:49:31 +00:00
|
|
|
{
|
2018-07-11 14:56:41 +01:00
|
|
|
struct pt_regs *regs = current_pt_regs();
|
2012-03-05 11:49:31 +00:00
|
|
|
struct rt_sigframe __user *frame;
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
struct user_access_state ua_state;
|
2012-03-05 11:49:31 +00:00
|
|
|
|
|
|
|
/* Always make any pending restarted system calls return -EINTR */
|
2015-02-12 15:01:14 -08:00
|
|
|
current->restart_block.fn = do_no_restart_syscall;
|
2012-03-05 11:49:31 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Since we stacked the signal on a 128-bit boundary, then 'sp' should
|
|
|
|
* be word aligned here.
|
|
|
|
*/
|
|
|
|
if (regs->sp & 15)
|
|
|
|
goto badframe;
|
|
|
|
|
|
|
|
frame = (struct rt_sigframe __user *)regs->sp;
|
|
|
|
|
Remove 'type' argument from access_ok() function
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.
It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access. But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.
A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model. And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.
This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
There were a couple of notable cases:
- csky still had the old "verify_area()" name as an alias.
- the iter_iov code had magical hardcoded knowledge of the actual
values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
really used it)
- microblaze used the type argument for a debug printout
but other than those oddities this should be a total no-op patch.
I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something. Any missed conversion should be trivially fixable, though.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-03 18:57:57 -08:00
|
|
|
if (!access_ok(frame, sizeof (*frame)))
|
2012-03-05 11:49:31 +00:00
|
|
|
goto badframe;
|
|
|
|
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
if (restore_sigframe(regs, frame, &ua_state))
|
2012-03-05 11:49:31 +00:00
|
|
|
goto badframe;
|
|
|
|
|
2024-10-01 23:59:04 +01:00
|
|
|
if (gcs_restore_signal())
|
|
|
|
goto badframe;
|
|
|
|
|
2012-12-23 01:56:45 -05:00
|
|
|
if (restore_altstack(&frame->uc.uc_stack))
|
2012-03-05 11:49:31 +00:00
|
|
|
goto badframe;
|
|
|
|
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
restore_user_access_state(&ua_state);
|
|
|
|
|
2012-03-05 11:49:31 +00:00
|
|
|
return regs->regs[0];
|
|
|
|
|
|
|
|
badframe:
|
2018-02-20 15:05:17 +00:00
|
|
|
arm64_notify_segfault(regs->sp);
|
2012-03-05 11:49:31 +00:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
arm64: signal: Report signal frame size to userspace via auxv
Stateful CPU architecture extensions may require the signal frame
to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
However, changing this #define is an ABI break.
To allow userspace the option of determining the signal frame size
in a more forwards-compatible way, this patch adds a new auxv entry
tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
size that the process can observe during its lifetime.
If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
assume that the MINSIGSTKSZ #define is sufficient. This allows for
a consistent interface with older kernels that do not provide
AT_MINSIGSTKSZ.
The idea is that libc could expose this via sysconf() or some
similar mechanism.
There is deliberately no AT_SIGSTKSZ. The kernel knows nothing
about userspace's own stack overheads and should not pretend to
know.
For arm64:
The primary motivation for this interface is the Scalable Vector
Extension, which can require at least 4KB or so of extra space
in the signal frame for the largest hardware implementations.
To determine the correct value, a "Christmas tree" mode (via the
add_all argument) is added to setup_sigframe_layout(), to simulate
addition of all possible records to the signal frame at maximum
possible size.
If this procedure goes wrong somehow, resulting in a stupidly large
frame layout and hence failure of sigframe_alloc() to allocate a
record to the frame, then this is indicative of a kernel bug. In
this case, we WARN() and no attempt is made to populate
AT_MINSIGSTKSZ for userspace.
For arm64 SVE:
The SVE context block in the signal frame needs to be considered
too when computing the maximum possible signal frame size.
Because the size of this block depends on the vector length, this
patch computes the size based not on the thread's current vector
length but instead on the maximum possible vector length: this
determines the maximum size of SVE context block that can be
observed in any signal frame for the lifetime of the process.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-01 11:10:14 +01:00
|
|
|
/*
|
|
|
|
* Determine the layout of optional records in the signal frame
|
|
|
|
*
|
|
|
|
* add_all: if true, lays out the biggest possible signal frame for
|
|
|
|
* this task; otherwise, generates a layout for the current state
|
|
|
|
* of the task.
|
|
|
|
*/
|
|
|
|
static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
|
|
|
|
bool add_all)
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
{
|
2017-06-15 15:03:41 +01:00
|
|
|
int err;
|
|
|
|
|
2022-02-25 11:40:08 +01:00
|
|
|
if (system_supports_fpsimd()) {
|
|
|
|
err = sigframe_alloc(user, &user->fpsimd_offset,
|
|
|
|
sizeof(struct fpsimd_context));
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
|
|
|
|
/* fault information, if valid */
|
arm64: signal: Report signal frame size to userspace via auxv
Stateful CPU architecture extensions may require the signal frame
to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
However, changing this #define is an ABI break.
To allow userspace the option of determining the signal frame size
in a more forwards-compatible way, this patch adds a new auxv entry
tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
size that the process can observe during its lifetime.
If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
assume that the MINSIGSTKSZ #define is sufficient. This allows for
a consistent interface with older kernels that do not provide
AT_MINSIGSTKSZ.
The idea is that libc could expose this via sysconf() or some
similar mechanism.
There is deliberately no AT_SIGSTKSZ. The kernel knows nothing
about userspace's own stack overheads and should not pretend to
know.
For arm64:
The primary motivation for this interface is the Scalable Vector
Extension, which can require at least 4KB or so of extra space
in the signal frame for the largest hardware implementations.
To determine the correct value, a "Christmas tree" mode (via the
add_all argument) is added to setup_sigframe_layout(), to simulate
addition of all possible records to the signal frame at maximum
possible size.
If this procedure goes wrong somehow, resulting in a stupidly large
frame layout and hence failure of sigframe_alloc() to allocate a
record to the frame, then this is indicative of a kernel bug. In
this case, we WARN() and no attempt is made to populate
AT_MINSIGSTKSZ for userspace.
For arm64 SVE:
The SVE context block in the signal frame needs to be considered
too when computing the maximum possible signal frame size.
Because the size of this block depends on the vector length, this
patch computes the size based not on the thread's current vector
length but instead on the maximum possible vector length: this
determines the maximum size of SVE context block that can be
observed in any signal frame for the lifetime of the process.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-01 11:10:14 +01:00
|
|
|
if (add_all || current->thread.fault_code) {
|
2017-06-15 15:03:41 +01:00
|
|
|
err = sigframe_alloc(user, &user->esr_offset,
|
|
|
|
sizeof(struct esr_context));
|
|
|
|
if (err)
|
|
|
|
return err;
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
}
|
|
|
|
|
2024-10-01 23:59:05 +01:00
|
|
|
#ifdef CONFIG_ARM64_GCS
|
|
|
|
if (system_supports_gcs() && (add_all || current->thread.gcspr_el0)) {
|
|
|
|
err = sigframe_alloc(user, &user->gcs_offset,
|
|
|
|
sizeof(struct gcs_context));
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2022-12-27 17:12:06 +00:00
|
|
|
if (system_supports_sve() || system_supports_sme()) {
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
unsigned int vq = 0;
|
|
|
|
|
2024-01-30 15:43:53 +00:00
|
|
|
if (add_all || current->thread.fp_type == FP_STATE_SVE ||
|
2022-04-19 12:22:26 +01:00
|
|
|
thread_sm_enabled(¤t->thread)) {
|
|
|
|
int vl = max(sve_max_vl(), sme_max_vl());
|
arm64: signal: Report signal frame size to userspace via auxv
Stateful CPU architecture extensions may require the signal frame
to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
However, changing this #define is an ABI break.
To allow userspace the option of determining the signal frame size
in a more forwards-compatible way, this patch adds a new auxv entry
tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
size that the process can observe during its lifetime.
If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
assume that the MINSIGSTKSZ #define is sufficient. This allows for
a consistent interface with older kernels that do not provide
AT_MINSIGSTKSZ.
The idea is that libc could expose this via sysconf() or some
similar mechanism.
There is deliberately no AT_SIGSTKSZ. The kernel knows nothing
about userspace's own stack overheads and should not pretend to
know.
For arm64:
The primary motivation for this interface is the Scalable Vector
Extension, which can require at least 4KB or so of extra space
in the signal frame for the largest hardware implementations.
To determine the correct value, a "Christmas tree" mode (via the
add_all argument) is added to setup_sigframe_layout(), to simulate
addition of all possible records to the signal frame at maximum
possible size.
If this procedure goes wrong somehow, resulting in a stupidly large
frame layout and hence failure of sigframe_alloc() to allocate a
record to the frame, then this is indicative of a kernel bug. In
this case, we WARN() and no attempt is made to populate
AT_MINSIGSTKSZ for userspace.
For arm64 SVE:
The SVE context block in the signal frame needs to be considered
too when computing the maximum possible signal frame size.
Because the size of this block depends on the vector length, this
patch computes the size based not on the thread's current vector
length but instead on the maximum possible vector length: this
determines the maximum size of SVE context block that can be
observed in any signal frame for the lifetime of the process.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-01 11:10:14 +01:00
|
|
|
|
|
|
|
if (!add_all)
|
2022-04-19 12:22:26 +01:00
|
|
|
vl = thread_get_cur_vl(¤t->thread);
|
arm64: signal: Report signal frame size to userspace via auxv
Stateful CPU architecture extensions may require the signal frame
to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
However, changing this #define is an ABI break.
To allow userspace the option of determining the signal frame size
in a more forwards-compatible way, this patch adds a new auxv entry
tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
size that the process can observe during its lifetime.
If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
assume that the MINSIGSTKSZ #define is sufficient. This allows for
a consistent interface with older kernels that do not provide
AT_MINSIGSTKSZ.
The idea is that libc could expose this via sysconf() or some
similar mechanism.
There is deliberately no AT_SIGSTKSZ. The kernel knows nothing
about userspace's own stack overheads and should not pretend to
know.
For arm64:
The primary motivation for this interface is the Scalable Vector
Extension, which can require at least 4KB or so of extra space
in the signal frame for the largest hardware implementations.
To determine the correct value, a "Christmas tree" mode (via the
add_all argument) is added to setup_sigframe_layout(), to simulate
addition of all possible records to the signal frame at maximum
possible size.
If this procedure goes wrong somehow, resulting in a stupidly large
frame layout and hence failure of sigframe_alloc() to allocate a
record to the frame, then this is indicative of a kernel bug. In
this case, we WARN() and no attempt is made to populate
AT_MINSIGSTKSZ for userspace.
For arm64 SVE:
The SVE context block in the signal frame needs to be considered
too when computing the maximum possible signal frame size.
Because the size of this block depends on the vector length, this
patch computes the size based not on the thread's current vector
length but instead on the maximum possible vector length: this
determines the maximum size of SVE context block that can be
observed in any signal frame for the lifetime of the process.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-01 11:10:14 +01:00
|
|
|
|
|
|
|
vq = sve_vq_from_vl(vl);
|
|
|
|
}
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
|
|
|
err = sigframe_alloc(user, &user->sve_offset,
|
|
|
|
SVE_SIG_CONTEXT_SIZE(vq));
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2023-03-17 20:49:13 +08:00
|
|
|
if (system_supports_tpidr2()) {
|
|
|
|
err = sigframe_alloc(user, &user->tpidr2_offset,
|
|
|
|
sizeof(struct tpidr2_context));
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2022-04-19 12:22:27 +01:00
|
|
|
if (system_supports_sme()) {
|
|
|
|
unsigned int vl;
|
|
|
|
unsigned int vq = 0;
|
|
|
|
|
|
|
|
if (add_all)
|
|
|
|
vl = sme_max_vl();
|
|
|
|
else
|
|
|
|
vl = task_get_sme_vl(current);
|
|
|
|
|
|
|
|
if (thread_za_enabled(¤t->thread))
|
|
|
|
vq = sve_vq_from_vl(vl);
|
|
|
|
|
|
|
|
err = sigframe_alloc(user, &user->za_offset,
|
|
|
|
ZA_SIG_CONTEXT_SIZE(vq));
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2023-01-16 16:04:46 +00:00
|
|
|
if (system_supports_sme2()) {
|
|
|
|
if (add_all || thread_za_enabled(¤t->thread)) {
|
|
|
|
err = sigframe_alloc(user, &user->zt_offset,
|
|
|
|
ZT_SIG_CONTEXT_SIZE(1));
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2024-03-06 23:14:49 +00:00
|
|
|
if (system_supports_fpmr()) {
|
|
|
|
err = sigframe_alloc(user, &user->fpmr_offset,
|
|
|
|
sizeof(struct fpmr_context));
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2024-08-22 16:11:02 +01:00
|
|
|
if (system_supports_poe()) {
|
|
|
|
err = sigframe_alloc(user, &user->poe_offset,
|
|
|
|
sizeof(struct poe_context));
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2017-06-20 18:23:39 +01:00
|
|
|
return sigframe_alloc_end(user);
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
}
|
|
|
|
|
2017-06-15 15:03:38 +01:00
|
|
|
static int setup_sigframe(struct rt_sigframe_user_layout *user,
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
struct pt_regs *regs, sigset_t *set,
|
|
|
|
const struct user_access_state *ua_state)
|
2012-03-05 11:49:31 +00:00
|
|
|
{
|
|
|
|
int i, err = 0;
|
2017-06-15 15:03:38 +01:00
|
|
|
struct rt_sigframe __user *sf = user->sigframe;
|
2012-03-05 11:49:31 +00:00
|
|
|
|
2012-11-23 12:34:13 +00:00
|
|
|
/* set up the stack frame for unwinding */
|
2017-06-15 15:03:38 +01:00
|
|
|
__put_user_error(regs->regs[29], &user->next_frame->fp, err);
|
|
|
|
__put_user_error(regs->regs[30], &user->next_frame->lr, err);
|
2012-11-23 12:34:13 +00:00
|
|
|
|
2012-03-05 11:49:31 +00:00
|
|
|
for (i = 0; i < 31; i++)
|
|
|
|
__put_user_error(regs->regs[i], &sf->uc.uc_mcontext.regs[i],
|
|
|
|
err);
|
|
|
|
__put_user_error(regs->sp, &sf->uc.uc_mcontext.sp, err);
|
|
|
|
__put_user_error(regs->pc, &sf->uc.uc_mcontext.pc, err);
|
|
|
|
__put_user_error(regs->pstate, &sf->uc.uc_mcontext.pstate, err);
|
|
|
|
|
|
|
|
__put_user_error(current->thread.fault_address, &sf->uc.uc_mcontext.fault_address, err);
|
|
|
|
|
|
|
|
err |= __copy_to_user(&sf->uc.uc_sigmask, set, sizeof(*set));
|
|
|
|
|
2020-01-13 23:30:22 +00:00
|
|
|
if (err == 0 && system_supports_fpsimd()) {
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
struct fpsimd_context __user *fpsimd_ctx =
|
|
|
|
apply_user_offset(user, user->fpsimd_offset);
|
2014-04-04 15:42:16 +01:00
|
|
|
err |= preserve_fpsimd_context(fpsimd_ctx);
|
|
|
|
}
|
2012-03-05 11:49:31 +00:00
|
|
|
|
2013-09-16 15:19:27 +01:00
|
|
|
/* fault information, if valid */
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
if (err == 0 && user->esr_offset) {
|
|
|
|
struct esr_context __user *esr_ctx =
|
|
|
|
apply_user_offset(user, user->esr_offset);
|
|
|
|
|
2013-09-16 15:19:27 +01:00
|
|
|
__put_user_error(ESR_MAGIC, &esr_ctx->head.magic, err);
|
|
|
|
__put_user_error(sizeof(*esr_ctx), &esr_ctx->head.size, err);
|
|
|
|
__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
|
|
|
|
}
|
|
|
|
|
2024-10-01 23:59:05 +01:00
|
|
|
if (system_supports_gcs() && err == 0 && user->gcs_offset) {
|
|
|
|
struct gcs_context __user *gcs_ctx =
|
|
|
|
apply_user_offset(user, user->gcs_offset);
|
|
|
|
err |= preserve_gcs_context(gcs_ctx);
|
|
|
|
}
|
|
|
|
|
2022-04-19 12:22:26 +01:00
|
|
|
/* Scalable Vector Extension state (including streaming), if present */
|
|
|
|
if ((system_supports_sve() || system_supports_sme()) &&
|
|
|
|
err == 0 && user->sve_offset) {
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
struct sve_context __user *sve_ctx =
|
|
|
|
apply_user_offset(user, user->sve_offset);
|
|
|
|
err |= preserve_sve_context(sve_ctx);
|
|
|
|
}
|
|
|
|
|
2022-12-27 14:20:41 +00:00
|
|
|
/* TPIDR2 if supported */
|
2023-03-17 20:49:12 +08:00
|
|
|
if (system_supports_tpidr2() && err == 0) {
|
2022-12-27 14:20:41 +00:00
|
|
|
struct tpidr2_context __user *tpidr2_ctx =
|
|
|
|
apply_user_offset(user, user->tpidr2_offset);
|
|
|
|
err |= preserve_tpidr2_context(tpidr2_ctx);
|
|
|
|
}
|
|
|
|
|
2024-03-06 23:14:49 +00:00
|
|
|
/* FPMR if supported */
|
|
|
|
if (system_supports_fpmr() && err == 0) {
|
|
|
|
struct fpmr_context __user *fpmr_ctx =
|
|
|
|
apply_user_offset(user, user->fpmr_offset);
|
|
|
|
err |= preserve_fpmr_context(fpmr_ctx);
|
|
|
|
}
|
|
|
|
|
2024-10-29 14:45:36 +00:00
|
|
|
if (system_supports_poe() && err == 0) {
|
2024-08-22 16:11:02 +01:00
|
|
|
struct poe_context __user *poe_ctx =
|
|
|
|
apply_user_offset(user, user->poe_offset);
|
|
|
|
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
err |= preserve_poe_context(poe_ctx, ua_state);
|
2024-08-22 16:11:02 +01:00
|
|
|
}
|
|
|
|
|
2022-04-19 12:22:27 +01:00
|
|
|
/* ZA state if present */
|
|
|
|
if (system_supports_sme() && err == 0 && user->za_offset) {
|
|
|
|
struct za_context __user *za_ctx =
|
|
|
|
apply_user_offset(user, user->za_offset);
|
|
|
|
err |= preserve_za_context(za_ctx);
|
|
|
|
}
|
|
|
|
|
2023-01-16 16:04:46 +00:00
|
|
|
/* ZT state if present */
|
|
|
|
if (system_supports_sme2() && err == 0 && user->zt_offset) {
|
|
|
|
struct zt_context __user *zt_ctx =
|
|
|
|
apply_user_offset(user, user->zt_offset);
|
|
|
|
err |= preserve_zt_context(zt_ctx);
|
|
|
|
}
|
|
|
|
|
2017-06-20 18:23:39 +01:00
|
|
|
if (err == 0 && user->extra_offset) {
|
|
|
|
char __user *sfp = (char __user *)user->sigframe;
|
|
|
|
char __user *userp =
|
|
|
|
apply_user_offset(user, user->extra_offset);
|
|
|
|
|
|
|
|
struct extra_context __user *extra;
|
|
|
|
struct _aarch64_ctx __user *end;
|
|
|
|
u64 extra_datap;
|
|
|
|
u32 extra_size;
|
|
|
|
|
|
|
|
extra = (struct extra_context __user *)userp;
|
|
|
|
userp += EXTRA_CONTEXT_SIZE;
|
|
|
|
|
|
|
|
end = (struct _aarch64_ctx __user *)userp;
|
|
|
|
userp += TERMINATOR_SIZE;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* extra_datap is just written to the signal frame.
|
|
|
|
* The value gets cast back to a void __user *
|
|
|
|
* during sigreturn.
|
|
|
|
*/
|
|
|
|
extra_datap = (__force u64)userp;
|
|
|
|
extra_size = sfp + round_up(user->size, 16) - userp;
|
|
|
|
|
|
|
|
__put_user_error(EXTRA_MAGIC, &extra->head.magic, err);
|
|
|
|
__put_user_error(EXTRA_CONTEXT_SIZE, &extra->head.size, err);
|
|
|
|
__put_user_error(extra_datap, &extra->datap, err);
|
|
|
|
__put_user_error(extra_size, &extra->size, err);
|
|
|
|
|
|
|
|
/* Add the terminator */
|
|
|
|
__put_user_error(0, &end->magic, err);
|
|
|
|
__put_user_error(0, &end->size, err);
|
|
|
|
}
|
|
|
|
|
2012-03-05 11:49:31 +00:00
|
|
|
/* set the "end" magic */
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
if (err == 0) {
|
|
|
|
struct _aarch64_ctx __user *end =
|
|
|
|
apply_user_offset(user, user->end_offset);
|
|
|
|
|
|
|
|
__put_user_error(0, &end->magic, err);
|
|
|
|
__put_user_error(0, &end->size, err);
|
|
|
|
}
|
2012-03-05 11:49:31 +00:00
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2017-06-15 15:03:38 +01:00
|
|
|
static int get_sigframe(struct rt_sigframe_user_layout *user,
|
|
|
|
struct ksignal *ksig, struct pt_regs *regs)
|
2012-03-05 11:49:31 +00:00
|
|
|
{
|
|
|
|
unsigned long sp, sp_top;
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
int err;
|
|
|
|
|
|
|
|
init_user_layout(user);
|
arm64: signal: Report signal frame size to userspace via auxv
Stateful CPU architecture extensions may require the signal frame
to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
However, changing this #define is an ABI break.
To allow userspace the option of determining the signal frame size
in a more forwards-compatible way, this patch adds a new auxv entry
tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
size that the process can observe during its lifetime.
If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
assume that the MINSIGSTKSZ #define is sufficient. This allows for
a consistent interface with older kernels that do not provide
AT_MINSIGSTKSZ.
The idea is that libc could expose this via sysconf() or some
similar mechanism.
There is deliberately no AT_SIGSTKSZ. The kernel knows nothing
about userspace's own stack overheads and should not pretend to
know.
For arm64:
The primary motivation for this interface is the Scalable Vector
Extension, which can require at least 4KB or so of extra space
in the signal frame for the largest hardware implementations.
To determine the correct value, a "Christmas tree" mode (via the
add_all argument) is added to setup_sigframe_layout(), to simulate
addition of all possible records to the signal frame at maximum
possible size.
If this procedure goes wrong somehow, resulting in a stupidly large
frame layout and hence failure of sigframe_alloc() to allocate a
record to the frame, then this is indicative of a kernel bug. In
this case, we WARN() and no attempt is made to populate
AT_MINSIGSTKSZ for userspace.
For arm64 SVE:
The SVE context block in the signal frame needs to be considered
too when computing the maximum possible signal frame size.
Because the size of this block depends on the vector length, this
patch computes the size based not on the thread's current vector
length but instead on the maximum possible vector length: this
determines the maximum size of SVE context block that can be
observed in any signal frame for the lifetime of the process.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-01 11:10:14 +01:00
|
|
|
err = setup_sigframe_layout(user, false);
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
if (err)
|
|
|
|
return err;
|
2012-03-05 11:49:31 +00:00
|
|
|
|
2014-03-05 13:31:20 +01:00
|
|
|
sp = sp_top = sigsp(regs->sp, ksig);
|
2012-03-05 11:49:31 +00:00
|
|
|
|
2017-06-15 15:03:38 +01:00
|
|
|
sp = round_down(sp - sizeof(struct frame_record), 16);
|
|
|
|
user->next_frame = (struct frame_record __user *)sp;
|
|
|
|
|
arm64: signal: factor frame layout and population into separate passes
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 15:03:40 +01:00
|
|
|
sp = round_down(sp, 16) - sigframe_size(user);
|
2017-06-15 15:03:38 +01:00
|
|
|
user->sigframe = (struct rt_sigframe __user *)sp;
|
2012-03-05 11:49:31 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Check that we can actually write to the signal frame.
|
|
|
|
*/
|
Remove 'type' argument from access_ok() function
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.
It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access. But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.
A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model. And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.
This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
There were a couple of notable cases:
- csky still had the old "verify_area()" name as an alias.
- the iter_iov code had magical hardcoded knowledge of the actual
values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
really used it)
- microblaze used the type argument for a debug printout
but other than those oddities this should be a total no-op patch.
I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something. Any missed conversion should be trivially fixable, though.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-03 18:57:57 -08:00
|
|
|
if (!access_ok(user->sigframe, sp_top - sp))
|
2017-06-15 15:03:38 +01:00
|
|
|
return -EFAULT;
|
2012-03-05 11:49:31 +00:00
|
|
|
|
2017-06-15 15:03:38 +01:00
|
|
|
return 0;
|
2012-03-05 11:49:31 +00:00
|
|
|
}
|
|
|
|
|
2024-10-01 23:59:04 +01:00
|
|
|
#ifdef CONFIG_ARM64_GCS
|
|
|
|
|
|
|
|
static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig)
|
|
|
|
{
|
2024-12-14 02:12:06 +00:00
|
|
|
u64 gcspr_el0;
|
2024-10-01 23:59:04 +01:00
|
|
|
int ret = 0;
|
|
|
|
|
|
|
|
if (!system_supports_gcs())
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
if (!task_gcs_el0_enabled(current))
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We are entering a signal handler, current register state is
|
|
|
|
* active.
|
|
|
|
*/
|
2024-12-14 02:12:06 +00:00
|
|
|
gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0);
|
2024-10-01 23:59:04 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Push a cap and the GCS entry for the trampoline onto the GCS.
|
|
|
|
*/
|
2024-12-14 02:12:06 +00:00
|
|
|
put_user_gcs((unsigned long)sigtramp,
|
|
|
|
(unsigned long __user *)(gcspr_el0 - 16), &ret);
|
|
|
|
put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 8),
|
|
|
|
(unsigned long __user *)(gcspr_el0 - 8), &ret);
|
2024-10-01 23:59:04 +01:00
|
|
|
if (ret != 0)
|
|
|
|
return ret;
|
|
|
|
|
2024-12-14 02:12:06 +00:00
|
|
|
gcspr_el0 -= 16;
|
|
|
|
write_sysreg_s(gcspr_el0, SYS_GCSPR_EL0);
|
2024-10-01 23:59:04 +01:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
#else
|
|
|
|
|
|
|
|
static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
#endif
|
|
|
|
|
|
|
|
static int setup_return(struct pt_regs *regs, struct ksignal *ksig,
|
2017-06-15 15:03:38 +01:00
|
|
|
struct rt_sigframe_user_layout *user, int usig)
|
2012-03-05 11:49:31 +00:00
|
|
|
{
|
|
|
|
__sigrestore_t sigtramp;
|
arm64: signal: Ensure signal delivery failure is recoverable
Commit eaf62ce1563b ("arm64/signal: Set up and restore the GCS
context for signal handlers") introduced a potential failure point
at the end of setup_return(). This is unfortunate as it is too late
to deliver a SIGSEGV: if that SIGSEGV is handled, the subsequent
sigreturn will end up returning to the original handler, which is
not the intention (since we failed to deliver that signal).
Make sure this does not happen by calling gcs_signal_entry()
at the very beginning of setup_return(), and add a comment just
after to discourage error cases being introduced from that point
onwards.
While at it, also take care of copy_siginfo_to_user(): since it may
fail, we shouldn't be calling it after setup_return() either. Call
it before setup_return() instead, and move the setting of X1/X2
inside setup_return() where it belongs (after the "point of no
failure").
Background: the first part of setup_rt_frame(), including
setup_sigframe(), has no impact on the execution of the interrupted
thread. The signal frame is written to the stack, but the stack
pointer remains unchanged. Failure at this stage can be recovered by
a SIGSEGV handler, and sigreturn will restore the original context,
at the point where the original signal occurred. On the other hand,
once setup_return() has updated registers including SP, the thread's
control flow has been modified and we must deliver the original
signal.
Fixes: eaf62ce1563b ("arm64/signal: Set up and restore the GCS context for signal handlers")
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241210160940.2031997-1-kevin.brodsky@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-10 16:09:40 +00:00
|
|
|
int err;
|
|
|
|
|
|
|
|
if (ksig->ka.sa.sa_flags & SA_RESTORER)
|
|
|
|
sigtramp = ksig->ka.sa.sa_restorer;
|
|
|
|
else
|
|
|
|
sigtramp = VDSO_SYMBOL(current->mm->context.vdso, sigtramp);
|
|
|
|
|
|
|
|
err = gcs_signal_entry(sigtramp, ksig);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We must not fail from this point onwards. We are going to update
|
|
|
|
* registers, including SP, in order to invoke the signal handler. If
|
|
|
|
* we failed and attempted to deliver a nested SIGSEGV to a handler
|
|
|
|
* after that point, the subsequent sigreturn would end up restoring
|
|
|
|
* the (partial) state for the original signal handler.
|
|
|
|
*/
|
2012-03-05 11:49:31 +00:00
|
|
|
|
|
|
|
regs->regs[0] = usig;
|
arm64: signal: Ensure signal delivery failure is recoverable
Commit eaf62ce1563b ("arm64/signal: Set up and restore the GCS
context for signal handlers") introduced a potential failure point
at the end of setup_return(). This is unfortunate as it is too late
to deliver a SIGSEGV: if that SIGSEGV is handled, the subsequent
sigreturn will end up returning to the original handler, which is
not the intention (since we failed to deliver that signal).
Make sure this does not happen by calling gcs_signal_entry()
at the very beginning of setup_return(), and add a comment just
after to discourage error cases being introduced from that point
onwards.
While at it, also take care of copy_siginfo_to_user(): since it may
fail, we shouldn't be calling it after setup_return() either. Call
it before setup_return() instead, and move the setting of X1/X2
inside setup_return() where it belongs (after the "point of no
failure").
Background: the first part of setup_rt_frame(), including
setup_sigframe(), has no impact on the execution of the interrupted
thread. The signal frame is written to the stack, but the stack
pointer remains unchanged. Failure at this stage can be recovered by
a SIGSEGV handler, and sigreturn will restore the original context,
at the point where the original signal occurred. On the other hand,
once setup_return() has updated registers including SP, the thread's
control flow has been modified and we must deliver the original
signal.
Fixes: eaf62ce1563b ("arm64/signal: Set up and restore the GCS context for signal handlers")
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241210160940.2031997-1-kevin.brodsky@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-10 16:09:40 +00:00
|
|
|
if (ksig->ka.sa.sa_flags & SA_SIGINFO) {
|
|
|
|
regs->regs[1] = (unsigned long)&user->sigframe->info;
|
|
|
|
regs->regs[2] = (unsigned long)&user->sigframe->uc;
|
|
|
|
}
|
2017-06-15 15:03:38 +01:00
|
|
|
regs->sp = (unsigned long)user->sigframe;
|
|
|
|
regs->regs[29] = (unsigned long)&user->next_frame->fp;
|
arm64: signal: Ensure signal delivery failure is recoverable
Commit eaf62ce1563b ("arm64/signal: Set up and restore the GCS
context for signal handlers") introduced a potential failure point
at the end of setup_return(). This is unfortunate as it is too late
to deliver a SIGSEGV: if that SIGSEGV is handled, the subsequent
sigreturn will end up returning to the original handler, which is
not the intention (since we failed to deliver that signal).
Make sure this does not happen by calling gcs_signal_entry()
at the very beginning of setup_return(), and add a comment just
after to discourage error cases being introduced from that point
onwards.
While at it, also take care of copy_siginfo_to_user(): since it may
fail, we shouldn't be calling it after setup_return() either. Call
it before setup_return() instead, and move the setting of X1/X2
inside setup_return() where it belongs (after the "point of no
failure").
Background: the first part of setup_rt_frame(), including
setup_sigframe(), has no impact on the execution of the interrupted
thread. The signal frame is written to the stack, but the stack
pointer remains unchanged. Failure at this stage can be recovered by
a SIGSEGV handler, and sigreturn will restore the original context,
at the point where the original signal occurred. On the other hand,
once setup_return() has updated registers including SP, the thread's
control flow has been modified and we must deliver the original
signal.
Fixes: eaf62ce1563b ("arm64/signal: Set up and restore the GCS context for signal handlers")
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241210160940.2031997-1-kevin.brodsky@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-10 16:09:40 +00:00
|
|
|
regs->regs[30] = (unsigned long)sigtramp;
|
2024-10-01 23:59:04 +01:00
|
|
|
regs->pc = (unsigned long)ksig->ka.sa.sa_handler;
|
2012-03-05 11:49:31 +00:00
|
|
|
|
2020-03-16 16:50:45 +00:00
|
|
|
/*
|
|
|
|
* Signal delivery is a (wacky) indirect function call in
|
|
|
|
* userspace, so simulate the same setting of BTYPE as a BLR
|
|
|
|
* <register containing the signal handler entry point>.
|
|
|
|
* Signal delivery to a location in a PROT_BTI guarded page
|
|
|
|
* that is not a function entry point will now trigger a
|
|
|
|
* SIGILL in userspace.
|
|
|
|
*
|
|
|
|
* If the signal handler entry point is not in a PROT_BTI
|
|
|
|
* guarded page, this is harmless.
|
|
|
|
*/
|
|
|
|
if (system_supports_bti()) {
|
|
|
|
regs->pstate &= ~PSR_BTYPE_MASK;
|
|
|
|
regs->pstate |= PSR_BTYPE_C;
|
|
|
|
}
|
|
|
|
|
2019-09-16 11:51:17 +01:00
|
|
|
/* TCO (Tag Check Override) always cleared for signal handlers */
|
|
|
|
regs->pstate &= ~PSR_TCO_BIT;
|
|
|
|
|
2022-04-19 12:22:25 +01:00
|
|
|
/* Signal handlers are invoked with ZA and streaming mode disabled */
|
|
|
|
if (system_supports_sme()) {
|
2025-05-08 14:26:29 +01:00
|
|
|
task_smstop_sm(current);
|
|
|
|
current->thread.svcr &= ~SVCR_ZA_MASK;
|
arm64/fpsimd: signal: Clear TPIDR2 when delivering signals
Linux is intended to be compatible with userspace written to Arm's
AAPCS64 procedure call standard [1,2]. For the Scalable Matrix Extension
(SME), AAPCS64 was extended with a "ZA lazy saving scheme", where SME's
ZA tile is lazily callee-saved and caller-restored. In this scheme,
TPIDR2_EL0 indicates whether the ZA tile is live or has been saved by
pointing to a "TPIDR2 block" in memory, which has a "za_save_buffer"
pointer. This scheme has been implemented in GCC and LLVM, with
necessary runtime support implemented in glibc.
AAPCS64 does not specify how the ZA lazy saving scheme is expected to
interact with signal handling, and the behaviour that AAPCS64 currently
recommends for (sig)setjmp() and (sig)longjmp() does not always compose
safely with signal handling, as explained below.
When Linux delivers a signal, it creates signal frames which contain the
original values of PSTATE.ZA, the ZA tile, and TPIDR_EL2. Between saving
the original state and entering the signal handler, Linux clears
PSTATE.ZA, but leaves TPIDR2_EL0 unchanged. Consequently a signal
handler can be entered with PSTATE.ZA=0 (meaning accesses to ZA will
trap), while TPIDR_EL0 is non-null (which may indicate that ZA needs to
be lazily saved, depending on the contents of the TPIDR2 block). While
in this state, libc and/or compiler runtime code, such as longjmp(), may
attempt to save ZA. As PSTATE.ZA=0, these accesses will trap, causing
the kernel to inject a SIGILL. Note that by virtue of lazy saving
occurring in libc and/or C runtime code, this can be triggered by
application/library code which is unaware of SME.
To avoid the problem above, the kernel must ensure that signal handlers
are entered with PSTATE.ZA and TPIDR2_EL0 configured in a manner which
complies with the ZA lazy saving scheme. Practically speaking, the only
choice is to enter signal handlers with PSTATE.ZA=0 and TPIDR2_EL0=NULL.
This change should not impact SME code which does not follow the ZA lazy
saving scheme (and hence does not use TPIDR2_EL0).
An alternative approach that was considered is to have the signal
handler inherit the original values of both PSTATE.ZA and TPIDR2_EL0,
relying on lazy save/restore sequences being idempotent and capable of
racing safely. This is not safe as signal handlers must be assumed to
have a "private ZA" interface, and therefore cannot be entered with
PSTATE.ZA=1 and TPIDR2_EL0=NULL, but it is legitimate for signals to be
taken from this state.
With the kernel fixed to clear TPIDR2_EL0, there are a couple of
remaining issues (largely masked by the first issue) that must be fixed
in userspace:
(1) When a (sig)setjmp() + (sig)longjmp() pair cross a signal boundary,
ZA state may be discarded when it needs to be preserved.
Currently, the ZA lazy saving scheme recommends that setjmp() does
not save ZA, and recommends that longjmp() is responsible for saving
ZA. A call to longjmp() in a signal handler will not have visibility
of ZA state that existed prior to entry to the signal, and when a
longjmp() is used to bypass a usual signal return, unsaved ZA state
will be discarded erroneously.
To fix this, it is necessary for setjmp() to eagerly save ZA state,
and for longjmp() to configure PSTATE.ZA=0 and TPIDR2_EL0=NULL. This
works regardless of whether a signal boundary is crossed.
(2) When a C++ exception is thrown and crosses a signal boundary before
it is caught, ZA state may be discarded when it needs to be
preserved.
AAPCS64 requires that exception handlers are entered with
PSTATE.{SM,ZA}={0,0} and TPIDR2_EL0=NULL, with exception unwind code
expected to perform any necessary save of ZA state.
Where it is necessary to perform an exception unwind across an
exception boundary, the unwind code must recover any necessary ZA
state (along with TPIDR2) from signal frames.
Fix the kernel as described above, with setup_return() clearing
TPIDR2_EL0 when delivering a signal. Folk on CC are working on fixes for
the remaining userspace issues, including updates/fixes to the AAPCS64
specification and glibc.
[1] https://github.com/ARM-software/abi-aa/releases/download/2025Q1/aapcs64.pdf
[2] https://github.com/ARM-software/abi-aa/blob/c51addc3dc03e73a016a1e4edf25440bcac76431/aapcs64/aapcs64.rst
Fixes: 39782210eb7e ("arm64/sme: Implement ZA signal handling")
Fixes: 39e54499280f ("arm64/signal: Include TPIDR2 in the signal context")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Daniel Kiss <daniel.kiss@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Richard Sandiford <richard.sandiford@arm.com>
Cc: Sander De Smalen <sander.desmalen@arm.com>
Cc: Tamas Petz <tamas.petz@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yury Khrustalev <yury.khrustalev@arm.com>
Link: https://lore.kernel.org/r/20250417190113.3778111-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-17 20:01:13 +01:00
|
|
|
write_sysreg_s(0, SYS_TPIDR2_EL0);
|
2022-04-19 12:22:25 +01:00
|
|
|
}
|
|
|
|
|
arm64: signal: Ensure signal delivery failure is recoverable
Commit eaf62ce1563b ("arm64/signal: Set up and restore the GCS
context for signal handlers") introduced a potential failure point
at the end of setup_return(). This is unfortunate as it is too late
to deliver a SIGSEGV: if that SIGSEGV is handled, the subsequent
sigreturn will end up returning to the original handler, which is
not the intention (since we failed to deliver that signal).
Make sure this does not happen by calling gcs_signal_entry()
at the very beginning of setup_return(), and add a comment just
after to discourage error cases being introduced from that point
onwards.
While at it, also take care of copy_siginfo_to_user(): since it may
fail, we shouldn't be calling it after setup_return() either. Call
it before setup_return() instead, and move the setting of X1/X2
inside setup_return() where it belongs (after the "point of no
failure").
Background: the first part of setup_rt_frame(), including
setup_sigframe(), has no impact on the execution of the interrupted
thread. The signal frame is written to the stack, but the stack
pointer remains unchanged. Failure at this stage can be recovered by
a SIGSEGV handler, and sigreturn will restore the original context,
at the point where the original signal occurred. On the other hand,
once setup_return() has updated registers including SP, the thread's
control flow has been modified and we must deliver the original
signal.
Fixes: eaf62ce1563b ("arm64/signal: Set up and restore the GCS context for signal handlers")
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241210160940.2031997-1-kevin.brodsky@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-10 16:09:40 +00:00
|
|
|
return 0;
|
2012-03-05 11:49:31 +00:00
|
|
|
}
|
|
|
|
|
2013-10-06 22:52:44 +02:00
|
|
|
static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
|
|
|
|
struct pt_regs *regs)
|
2012-03-05 11:49:31 +00:00
|
|
|
{
|
2017-06-15 15:03:38 +01:00
|
|
|
struct rt_sigframe_user_layout user;
|
2012-03-05 11:49:31 +00:00
|
|
|
struct rt_sigframe __user *frame;
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
struct user_access_state ua_state;
|
2012-03-05 11:49:31 +00:00
|
|
|
int err = 0;
|
|
|
|
|
arm64/fpsimd: signal: Always save+flush state early
There are several issues with the way the native signal handling code
manipulates FPSIMD/SVE/SME state, described in detail below. These
issues largely result from races with preemption and inconsistent
handling of live state vs saved state.
Known issues with native FPSIMD/SVE/SME state management include:
* On systems with FPMR, the code to save/restore the FPMR accesses the
register while it is not owned by the current task. Consequently, this
may corrupt the FPMR of the current task and/or may corrupt the FPMR
of an unrelated task. The FPMR save/restore has been broken since it
was introduced in commit:
8c46def44409fc91 ("arm64/signal: Add FPMR signal handling")
* On systems with SME, setup_return() modifies both the live register
state and the saved state register state regardless of whether the
task's state is live, and without holding the cpu fpsimd context.
Consequently:
- This may corrupt the state an unrelated task which has PSTATE.SM set
and/or PSTATE.ZA set.
- The task may enter the signal handler in streaming mode, and or with
ZA storage enabled unexpectedly.
- The task may enter the signal handler in non-streaming SVE mode with
stale SVE register state, which may have been inherited from
streaming SVE mode unexpectedly. Where the streaming and
non-streaming vector lengths differ, this may be packed into
registers arbitrarily.
This logic has been broken since it was introduced in commit:
40a8e87bb32855b3 ("arm64/sme: Disable ZA and streaming mode when handling signals")
Further incorrect manipulation of state was added in commits:
ea64baacbc36a0d5 ("arm64/signal: Flush FPSIMD register state when disabling streaming mode")
baa8515281b30861 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
* Several restoration functions use fpsimd_flush_task_state() to discard
the live FPSIMD/SVE/SME while the in-memory copy is stale.
When a subset of the FPSIMD/SVE/SME state is restored, the remainder
may be non-deterministically reset to a stale snapshot from some
arbitrary point in the past.
This non-deterministic discarding was introduced in commit:
8cd969d28fd2848d ("arm64/sve: Signal handling support")
As of that commit, when TIF_SVE was initially clear, failure to
restore the SVE signal frame could reset the FPSIMD registers to a
stale snapshot.
The pattern of discarding unsaved state was subsequently copied into
restoration functions for some new state in commits:
39782210eb7e8763 ("arm64/sme: Implement ZA signal handling")
ee072cf708048c0d ("arm64/sme: Implement signal handling for ZT")
* On systems with SME/SME2, the entire FPSIMD/SVE/SME state may be
loaded onto the CPU redundantly. Either restore_fpsimd_context() or
restore_sve_fpsimd_context() will load the entire FPSIMD/SVE/SME state
via fpsimd_update_current_state() before restore_za_context() and
restore_zt_context() each discard the state via
fpsimd_flush_task_state().
This is purely redundant work, and not a functional bug.
To fix these issues, rework the native signal handling code to always
save+flush the current task's FPSIMD/SVE/SME state before manipulating
that state. This avoids races with preemption and ensures that state is
manipulated consistently regardless of whether it happened to be live
prior to manipulation. This largely involes:
* Using fpsimd_save_and_flush_current_state() to save+flush the state
for both signal delivery and signal return, before the state is
manipulated in any way.
* Removing fpsimd_signal_preserve_current_state() and updating
preserve_fpsimd_context() to explicitly ensure that the FPSIMD state
is up-to-date, as preserve_fpsimd_context() is the only consumer of
the FPSIMD state during signal delivery.
* Modifying fpsimd_update_current_state() to not reload the FPSIMD state
onto the CPU. Ideally we'd remove fpsimd_update_current_state()
entirely, but I've left that for subsequent patches as there are a
number of of other problems with the FPSIMD<->SVE conversion helpers
that should be addressed at the same time. For now I've removed the
misleading comment.
For setup_return(), we need to decide (for ABI reasons) whether signal
delivery should have all the side-effects of an SMSTOP. For now I've
left a TODO comment, as there are other questions in this area that I'll
address with subsequent patches.
Fixes: 8c46def44409 ("arm64/signal: Add FPMR signal handling")
Fixes: 40a8e87bb328 ("arm64/sme: Disable ZA and streaming mode when handling signals")
Fixes: ea64baacbc36 ("arm64/signal: Flush FPSIMD register state when disabling streaming mode")
Fixes: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE")
Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support")
Fixes: 39782210eb7e ("arm64/sme: Implement ZA signal handling")
Fixes: ee072cf70804 ("arm64/sme: Implement signal handling for ZT")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250409164010.3480271-13-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-09 17:40:09 +01:00
|
|
|
fpsimd_save_and_flush_current_state();
|
arm64/sve: Signal handling support
This patch implements support for saving and restoring the SVE
registers around signals.
A fixed-size header struct sve_context is always included in the
signal frame encoding the thread's vector length at the time of
signal delivery, optionally followed by a variable-layout structure
encoding the SVE registers.
Because of the need to preserve backwards compatibility, the FPSIMD
view of the SVE registers is always dumped as a struct
fpsimd_context in the usual way, in addition to any sve_context.
The SVE vector registers are dumped in full, including bits 127:0
of each register which alias the corresponding FPSIMD vector
registers in the hardware. To avoid any ambiguity about which
alias to restore during sigreturn, the kernel always restores bits
127:0 of each SVE vector register from the fpsimd_context in the
signal frame (which must be present): userspace needs to take this
into account if it wants to modify the SVE vector register contents
on return from a signal.
FPSR and FPCR, which are used by both FPSIMD and SVE, are not
included in sve_context because they are always present in
fpsimd_context anyway.
For signal delivery, a new helper
fpsimd_signal_preserve_current_state() is added to update _both_
the FPSIMD and SVE views in the task struct, to make it easier to
populate this information into the signal frame. Because of the
redundancy between the two views of the state, only one is updated
otherwise.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:07 +00:00
|
|
|
|
2017-06-15 15:03:38 +01:00
|
|
|
if (get_sigframe(&user, ksig, regs))
|
2012-03-05 11:49:31 +00:00
|
|
|
return 1;
|
|
|
|
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
save_reset_user_access_state(&ua_state);
|
2017-06-15 15:03:38 +01:00
|
|
|
frame = user.sigframe;
|
|
|
|
|
2012-03-05 11:49:31 +00:00
|
|
|
__put_user_error(0, &frame->uc.uc_flags, err);
|
|
|
|
__put_user_error(NULL, &frame->uc.uc_link, err);
|
|
|
|
|
2012-12-23 01:56:45 -05:00
|
|
|
err |= __save_altstack(&frame->uc.uc_stack, regs->sp);
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
err |= setup_sigframe(&user, regs, set, &ua_state);
|
arm64: signal: Ensure signal delivery failure is recoverable
Commit eaf62ce1563b ("arm64/signal: Set up and restore the GCS
context for signal handlers") introduced a potential failure point
at the end of setup_return(). This is unfortunate as it is too late
to deliver a SIGSEGV: if that SIGSEGV is handled, the subsequent
sigreturn will end up returning to the original handler, which is
not the intention (since we failed to deliver that signal).
Make sure this does not happen by calling gcs_signal_entry()
at the very beginning of setup_return(), and add a comment just
after to discourage error cases being introduced from that point
onwards.
While at it, also take care of copy_siginfo_to_user(): since it may
fail, we shouldn't be calling it after setup_return() either. Call
it before setup_return() instead, and move the setting of X1/X2
inside setup_return() where it belongs (after the "point of no
failure").
Background: the first part of setup_rt_frame(), including
setup_sigframe(), has no impact on the execution of the interrupted
thread. The signal frame is written to the stack, but the stack
pointer remains unchanged. Failure at this stage can be recovered by
a SIGSEGV handler, and sigreturn will restore the original context,
at the point where the original signal occurred. On the other hand,
once setup_return() has updated registers including SP, the thread's
control flow has been modified and we must deliver the original
signal.
Fixes: eaf62ce1563b ("arm64/signal: Set up and restore the GCS context for signal handlers")
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241210160940.2031997-1-kevin.brodsky@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-10 16:09:40 +00:00
|
|
|
if (ksig->ka.sa.sa_flags & SA_SIGINFO)
|
|
|
|
err |= copy_siginfo_to_user(&frame->info, &ksig->info);
|
|
|
|
|
|
|
|
if (err == 0)
|
2024-10-01 23:59:04 +01:00
|
|
|
err = setup_return(regs, ksig, &user, usig);
|
arm64: signal: Ensure signal delivery failure is recoverable
Commit eaf62ce1563b ("arm64/signal: Set up and restore the GCS
context for signal handlers") introduced a potential failure point
at the end of setup_return(). This is unfortunate as it is too late
to deliver a SIGSEGV: if that SIGSEGV is handled, the subsequent
sigreturn will end up returning to the original handler, which is
not the intention (since we failed to deliver that signal).
Make sure this does not happen by calling gcs_signal_entry()
at the very beginning of setup_return(), and add a comment just
after to discourage error cases being introduced from that point
onwards.
While at it, also take care of copy_siginfo_to_user(): since it may
fail, we shouldn't be calling it after setup_return() either. Call
it before setup_return() instead, and move the setting of X1/X2
inside setup_return() where it belongs (after the "point of no
failure").
Background: the first part of setup_rt_frame(), including
setup_sigframe(), has no impact on the execution of the interrupted
thread. The signal frame is written to the stack, but the stack
pointer remains unchanged. Failure at this stage can be recovered by
a SIGSEGV handler, and sigreturn will restore the original context,
at the point where the original signal occurred. On the other hand,
once setup_return() has updated registers including SP, the thread's
control flow has been modified and we must deliver the original
signal.
Fixes: eaf62ce1563b ("arm64/signal: Set up and restore the GCS context for signal handlers")
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20241210160940.2031997-1-kevin.brodsky@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-10 16:09:40 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* We must not fail if setup_return() succeeded - see comment at the
|
|
|
|
* beginning of setup_return().
|
|
|
|
*/
|
2012-03-05 11:49:31 +00:00
|
|
|
|
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
Reset POR_EL0 to "allow all" before writing the signal frame, preventing
spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As to sigreturn(), it expects having access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. If all operations succeeded:
a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Else reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Fixes: 9160f7e909e1 ("arm64: add POE signal support")
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-10-29 14:45:35 +00:00
|
|
|
if (err == 0)
|
|
|
|
set_handler_user_access_state();
|
|
|
|
else
|
|
|
|
restore_user_access_state(&ua_state);
|
|
|
|
|
2012-03-05 11:49:31 +00:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void setup_restart_syscall(struct pt_regs *regs)
|
|
|
|
{
|
|
|
|
if (is_compat_task())
|
|
|
|
compat_setup_restart_syscall(regs);
|
|
|
|
else
|
|
|
|
regs->regs[8] = __NR_restart_syscall;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* OK, we're invoking a handler
|
|
|
|
*/
|
2013-10-06 22:52:44 +02:00
|
|
|
static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
|
2012-03-05 11:49:31 +00:00
|
|
|
{
|
|
|
|
sigset_t *oldset = sigmask_to_save();
|
2013-10-06 22:52:44 +02:00
|
|
|
int usig = ksig->sig;
|
2012-03-05 11:49:31 +00:00
|
|
|
int ret;
|
|
|
|
|
2018-06-20 14:46:50 +01:00
|
|
|
rseq_signal_deliver(ksig, regs);
|
|
|
|
|
2012-03-05 11:49:31 +00:00
|
|
|
/*
|
|
|
|
* Set up the stack frame
|
|
|
|
*/
|
|
|
|
if (is_compat_task()) {
|
2013-10-06 22:52:44 +02:00
|
|
|
if (ksig->ka.sa.sa_flags & SA_SIGINFO)
|
|
|
|
ret = compat_setup_rt_frame(usig, ksig, oldset, regs);
|
2012-03-05 11:49:31 +00:00
|
|
|
else
|
2013-10-06 22:52:44 +02:00
|
|
|
ret = compat_setup_frame(usig, ksig, oldset, regs);
|
2012-03-05 11:49:31 +00:00
|
|
|
} else {
|
2013-10-06 22:52:44 +02:00
|
|
|
ret = setup_rt_frame(usig, ksig, oldset, regs);
|
2012-03-05 11:49:31 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Check that the resulting registers are actually sane.
|
|
|
|
*/
|
2016-03-01 14:18:50 +00:00
|
|
|
ret |= !valid_user_regs(®s->user_regs, current);
|
2012-03-05 11:49:31 +00:00
|
|
|
|
2020-07-02 21:16:20 +01:00
|
|
|
/* Step into the signal handler if we are stepping */
|
|
|
|
signal_setup_done(ret, ksig, test_thread_flag(TIF_SINGLESTEP));
|
2012-03-05 11:49:31 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Note that 'init' is a special process: it doesn't get signals it doesn't
|
|
|
|
* want to handle. Thus you cannot kill init even with a SIGKILL even by
|
|
|
|
* mistake.
|
|
|
|
*
|
|
|
|
* Note that we go through the signals twice: once to check the signals that
|
|
|
|
* the kernel can handle, and then we build all the user-level signal handling
|
|
|
|
* stack-frames in one go after that.
|
|
|
|
*/
|
2024-02-06 12:38:47 +00:00
|
|
|
void do_signal(struct pt_regs *regs)
|
2012-03-05 11:49:31 +00:00
|
|
|
{
|
|
|
|
unsigned long continue_addr = 0, restart_addr = 0;
|
2013-10-06 22:52:44 +02:00
|
|
|
int retval = 0;
|
|
|
|
struct ksignal ksig;
|
2018-06-07 12:32:05 +01:00
|
|
|
bool syscall = in_syscall(regs);
|
2012-03-05 11:49:31 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If we were from a system call, check for system call restarting...
|
|
|
|
*/
|
2018-06-07 12:32:05 +01:00
|
|
|
if (syscall) {
|
2012-03-05 11:49:31 +00:00
|
|
|
continue_addr = regs->pc;
|
|
|
|
restart_addr = continue_addr - (compat_thumb_mode(regs) ? 2 : 4);
|
|
|
|
retval = regs->regs[0];
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Avoid additional syscall restarting via ret_to_user.
|
|
|
|
*/
|
2017-08-01 15:35:54 +01:00
|
|
|
forget_syscall(regs);
|
2012-03-05 11:49:31 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Prepare for system call restart. We do this here so that a
|
|
|
|
* debugger will see the already changed PC.
|
|
|
|
*/
|
|
|
|
switch (retval) {
|
|
|
|
case -ERESTARTNOHAND:
|
|
|
|
case -ERESTARTSYS:
|
|
|
|
case -ERESTARTNOINTR:
|
|
|
|
case -ERESTART_RESTARTBLOCK:
|
|
|
|
regs->regs[0] = regs->orig_x0;
|
|
|
|
regs->pc = restart_addr;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Get the signal to deliver. When running under ptrace, at this point
|
|
|
|
* the debugger may change all of our registers.
|
|
|
|
*/
|
2013-10-06 22:52:44 +02:00
|
|
|
if (get_signal(&ksig)) {
|
2012-03-05 11:49:31 +00:00
|
|
|
/*
|
|
|
|
* Depending on the signal settings, we may need to revert the
|
|
|
|
* decision to restart the system call, but skip this if a
|
|
|
|
* debugger has chosen to restart at a different PC.
|
|
|
|
*/
|
|
|
|
if (regs->pc == restart_addr &&
|
|
|
|
(retval == -ERESTARTNOHAND ||
|
|
|
|
retval == -ERESTART_RESTARTBLOCK ||
|
|
|
|
(retval == -ERESTARTSYS &&
|
2013-10-06 22:52:44 +02:00
|
|
|
!(ksig.ka.sa.sa_flags & SA_RESTART)))) {
|
arm64: fix compat syscall return truncation
Due to inconsistencies in the way we manipulate compat GPRs, we have a
few issues today:
* For audit and tracing, where error codes are handled as a (native)
long, negative error codes are expected to be sign-extended to the
native 64-bits, or they may fail to be matched correctly. Thus a
syscall which fails with an error may erroneously be identified as
failing.
* For ptrace, *all* compat return values should be sign-extended for
consistency with 32-bit arm, but we currently only do this for
negative return codes.
* As we may transiently set the upper 32 bits of some compat GPRs while
in the kernel, these can be sampled by perf, which is somewhat
confusing. This means that where a syscall returns a pointer above 2G,
this will be sign-extended, but will not be mistaken for an error as
error codes are constrained to the inclusive range [-4096, -1] where
no user pointer can exist.
To fix all of these, we must consistently use helpers to get/set the
compat GPRs, ensuring that we never write the upper 32 bits of the
return code, and always sign-extend when reading the return code. This
patch does so, with the following changes:
* We re-organise syscall_get_return_value() to always sign-extend for
compat tasks, and reimplement syscall_get_error() atop. We update
syscall_trace_exit() to use syscall_get_return_value().
* We consistently use syscall_set_return_value() to set the return
value, ensureing the upper 32 bits are never set unexpectedly.
* As the core audit code currently uses regs_return_value() rather than
syscall_get_return_value(), we special-case this for
compat_user_mode(regs) such that this will do the right thing. Going
forward, we should try to move the core audit code over to
syscall_get_return_value().
Cc: <stable@vger.kernel.org>
Reported-by: He Zhe <zhe.he@windriver.com>
Reported-by: weiyuchen <weiyuchen3@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210802104200.21390-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-08-02 11:42:00 +01:00
|
|
|
syscall_set_return_value(current, regs, -EINTR, 0);
|
2012-03-05 11:49:31 +00:00
|
|
|
regs->pc = continue_addr;
|
|
|
|
}
|
|
|
|
|
2013-10-06 22:52:44 +02:00
|
|
|
handle_signal(&ksig, regs);
|
2012-03-05 11:49:31 +00:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Handle restarting a different system call. As above, if a debugger
|
|
|
|
* has chosen to restart at a different PC, ignore the restart.
|
|
|
|
*/
|
2018-06-07 12:32:05 +01:00
|
|
|
if (syscall && regs->pc == restart_addr) {
|
2012-03-05 11:49:31 +00:00
|
|
|
if (retval == -ERESTART_RESTARTBLOCK)
|
|
|
|
setup_restart_syscall(regs);
|
|
|
|
user_rewind_single_step(current);
|
|
|
|
}
|
|
|
|
|
|
|
|
restore_saved_sigmask();
|
|
|
|
}
|
|
|
|
|
arm64: signal: Report signal frame size to userspace via auxv
Stateful CPU architecture extensions may require the signal frame
to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
However, changing this #define is an ABI break.
To allow userspace the option of determining the signal frame size
in a more forwards-compatible way, this patch adds a new auxv entry
tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
size that the process can observe during its lifetime.
If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
assume that the MINSIGSTKSZ #define is sufficient. This allows for
a consistent interface with older kernels that do not provide
AT_MINSIGSTKSZ.
The idea is that libc could expose this via sysconf() or some
similar mechanism.
There is deliberately no AT_SIGSTKSZ. The kernel knows nothing
about userspace's own stack overheads and should not pretend to
know.
For arm64:
The primary motivation for this interface is the Scalable Vector
Extension, which can require at least 4KB or so of extra space
in the signal frame for the largest hardware implementations.
To determine the correct value, a "Christmas tree" mode (via the
add_all argument) is added to setup_sigframe_layout(), to simulate
addition of all possible records to the signal frame at maximum
possible size.
If this procedure goes wrong somehow, resulting in a stupidly large
frame layout and hence failure of sigframe_alloc() to allocate a
record to the frame, then this is indicative of a kernel bug. In
this case, we WARN() and no attempt is made to populate
AT_MINSIGSTKSZ for userspace.
For arm64 SVE:
The SVE context block in the signal frame needs to be considered
too when computing the maximum possible signal frame size.
Because the size of this block depends on the vector length, this
patch computes the size based not on the thread's current vector
length but instead on the maximum possible vector length: this
determines the maximum size of SVE context block that can be
observed in any signal frame for the lifetime of the process.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-01 11:10:14 +01:00
|
|
|
unsigned long __ro_after_init signal_minsigstksz;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Determine the stack space required for guaranteed signal devliery.
|
|
|
|
* This function is used to populate AT_MINSIGSTKSZ at process startup.
|
|
|
|
* cpufeatures setup is assumed to be complete.
|
|
|
|
*/
|
|
|
|
void __init minsigstksz_setup(void)
|
|
|
|
{
|
|
|
|
struct rt_sigframe_user_layout user;
|
|
|
|
|
|
|
|
init_user_layout(&user);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If this fails, SIGFRAME_MAXSZ needs to be enlarged. It won't
|
|
|
|
* be big enough, but it's our best guess:
|
|
|
|
*/
|
|
|
|
if (WARN_ON(setup_sigframe_layout(&user, true)))
|
|
|
|
return;
|
|
|
|
|
|
|
|
signal_minsigstksz = sigframe_size(&user) +
|
|
|
|
round_up(sizeof(struct frame_record), 16) +
|
|
|
|
16; /* max alignment padding */
|
|
|
|
}
|
2021-04-29 21:07:34 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Compile-time assertions for siginfo_t offsets. Check NSIG* as well, as
|
|
|
|
* changes likely come with new fields that should be added below.
|
|
|
|
*/
|
|
|
|
static_assert(NSIGILL == 11);
|
|
|
|
static_assert(NSIGFPE == 15);
|
2023-06-12 17:10:53 -07:00
|
|
|
static_assert(NSIGSEGV == 10);
|
2021-04-29 21:07:34 +02:00
|
|
|
static_assert(NSIGBUS == 5);
|
|
|
|
static_assert(NSIGTRAP == 6);
|
|
|
|
static_assert(NSIGCHLD == 6);
|
|
|
|
static_assert(NSIGSYS == 2);
|
2021-05-04 11:25:22 -05:00
|
|
|
static_assert(sizeof(siginfo_t) == 128);
|
|
|
|
static_assert(__alignof__(siginfo_t) == 8);
|
2021-04-29 21:07:34 +02:00
|
|
|
static_assert(offsetof(siginfo_t, si_signo) == 0x00);
|
|
|
|
static_assert(offsetof(siginfo_t, si_errno) == 0x04);
|
|
|
|
static_assert(offsetof(siginfo_t, si_code) == 0x08);
|
|
|
|
static_assert(offsetof(siginfo_t, si_pid) == 0x10);
|
|
|
|
static_assert(offsetof(siginfo_t, si_uid) == 0x14);
|
|
|
|
static_assert(offsetof(siginfo_t, si_tid) == 0x10);
|
|
|
|
static_assert(offsetof(siginfo_t, si_overrun) == 0x14);
|
|
|
|
static_assert(offsetof(siginfo_t, si_status) == 0x18);
|
|
|
|
static_assert(offsetof(siginfo_t, si_utime) == 0x20);
|
|
|
|
static_assert(offsetof(siginfo_t, si_stime) == 0x28);
|
|
|
|
static_assert(offsetof(siginfo_t, si_value) == 0x18);
|
|
|
|
static_assert(offsetof(siginfo_t, si_int) == 0x18);
|
|
|
|
static_assert(offsetof(siginfo_t, si_ptr) == 0x18);
|
|
|
|
static_assert(offsetof(siginfo_t, si_addr) == 0x10);
|
|
|
|
static_assert(offsetof(siginfo_t, si_addr_lsb) == 0x18);
|
|
|
|
static_assert(offsetof(siginfo_t, si_lower) == 0x20);
|
|
|
|
static_assert(offsetof(siginfo_t, si_upper) == 0x28);
|
|
|
|
static_assert(offsetof(siginfo_t, si_pkey) == 0x20);
|
|
|
|
static_assert(offsetof(siginfo_t, si_perf_data) == 0x18);
|
|
|
|
static_assert(offsetof(siginfo_t, si_perf_type) == 0x20);
|
signal: Deliver SIGTRAP on perf event asynchronously if blocked
With SIGTRAP on perf events, we have encountered termination of
processes due to user space attempting to block delivery of SIGTRAP.
Consider this case:
<set up SIGTRAP on a perf event>
...
sigset_t s;
sigemptyset(&s);
sigaddset(&s, SIGTRAP | <and others>);
sigprocmask(SIG_BLOCK, &s, ...);
...
<perf event triggers>
When the perf event triggers, while SIGTRAP is blocked, force_sig_perf()
will force the signal, but revert back to the default handler, thus
terminating the task.
This makes sense for error conditions, but not so much for explicitly
requested monitoring. However, the expectation is still that signals
generated by perf events are synchronous, which will no longer be the
case if the signal is blocked and delivered later.
To give user space the ability to clearly distinguish synchronous from
asynchronous signals, introduce siginfo_t::si_perf_flags and
TRAP_PERF_FLAG_ASYNC (opted for flags in case more binary information is
required in future).
The resolution to the problem is then to (a) no longer force the signal
(avoiding the terminations), but (b) tell user space via si_perf_flags
if the signal was synchronous or not, so that such signals can be
handled differently (e.g. let user space decide to ignore or consider
the data imprecise).
The alternative of making the kernel ignore SIGTRAP on perf events if
the signal is blocked may work for some usecases, but likely causes
issues in others that then have to revert back to interception of
sigprocmask() (which we want to avoid). [ A concrete example: when using
breakpoint perf events to track data-flow, in a region of code where
signals are blocked, data-flow can no longer be tracked accurately.
When a relevant asynchronous signal is received after unblocking the
signal, the data-flow tracking logic needs to know its state is
imprecise. ]
Fixes: 97ba62b27867 ("perf: Add support for SIGTRAP on perf events")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/20220404111204.935357-1-elver@google.com
2022-04-04 13:12:04 +02:00
|
|
|
static_assert(offsetof(siginfo_t, si_perf_flags) == 0x24);
|
2021-04-29 21:07:34 +02:00
|
|
|
static_assert(offsetof(siginfo_t, si_band) == 0x10);
|
|
|
|
static_assert(offsetof(siginfo_t, si_fd) == 0x18);
|
|
|
|
static_assert(offsetof(siginfo_t, si_call_addr) == 0x10);
|
|
|
|
static_assert(offsetof(siginfo_t, si_syscall) == 0x18);
|
|
|
|
static_assert(offsetof(siginfo_t, si_arch) == 0x1c);
|