linux/arch/arm64/include/asm/irqflags.h

200 lines
4.3 KiB
C
Raw Permalink Normal View History

/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2012 ARM Ltd.
*/
#ifndef __ASM_IRQFLAGS_H
#define __ASM_IRQFLAGS_H
#include <asm/barrier.h>
#include <asm/ptrace.h>
#include <asm/sysreg.h>
/*
* Aarch64 has flags for masking: Debug, Asynchronous (serror), Interrupts and
* FIQ exceptions, in the 'daif' register. We mask and unmask them in 'daif'
* order:
* Masking debug exceptions causes all other exceptions to be masked too/
* Masking SError masks IRQ/FIQ, but not debug exceptions. IRQ and FIQ are
* always masked and unmasked together, and have no side effects for other
* flags. Keeping to this order makes it easier for entry.S to know which
* exceptions should be unmasked.
*/
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
static __always_inline void __daif_local_irq_enable(void)
{
barrier();
asm volatile("msr daifclr, #3");
barrier();
}
static __always_inline void __pmr_local_irq_enable(void)
{
if (IS_ENABLED(CONFIG_ARM64_DEBUG_PRIORITY_MASKING)) {
u32 pmr = read_sysreg_s(SYS_ICC_PMR_EL1);
WARN_ON_ONCE(pmr != GIC_PRIO_IRQON && pmr != GIC_PRIO_IRQOFF);
}
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
barrier();
write_sysreg_s(GIC_PRIO_IRQON, SYS_ICC_PMR_EL1);
pmr_sync();
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
barrier();
}
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
static inline void arch_local_irq_enable(void)
{
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_GIC_PRIO_MASKING In system_uses_irq_prio_masking() we use cpus_have_const_cap() to check for ARM64_HAS_GIC_PRIO_MASKING, but this is not necessary and alternative_has_cap_*() would be preferable. For historical reasons, cpus_have_const_cap() is more complicated than it needs to be. Before cpucaps are finalized, it will perform a bitmap test of the system_cpucaps bitmap, and once cpucaps are finalized it will use an alternative branch. This used to be necessary to handle some race conditions in the window between cpucap detection and the subsequent patching of alternatives and static branches, where different branches could be out-of-sync with one another (or w.r.t. alternative sequences). Now that we use alternative branches instead of static branches, these are all patched atomically w.r.t. one another, and there are only a handful of cases that need special care in the window between cpucap detection and alternative patching. Due to the above, it would be nice to remove cpus_have_const_cap(), and migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(), or cpus_have_cap() depending on when their requirements. This will remove redundant instructions and improve code generation, and will make it easier to determine how each callsite will behave before, during, and after alternative patching. When CONFIG_ARM64_PSEUDO_NMI=y the ARM64_HAS_GIC_PRIO_MASKING cpucap is a strict boot cpu feature which is detected and patched early on the boot cpu, which both happen in smp_prepare_boot_cpu(). In the window between the ARM64_HAS_GIC_PRIO_MASKING cpucap is detected and alternatives are patched we don't run any code that depends upon the ARM64_HAS_GIC_PRIO_MASKING cpucap: * We leave DAIF.IF set until after boot alternatives are patched, and interrupts are unmasked later in init_IRQ(), so we cannot reach IRQ/FIQ entry code and will not use irqs_priority_unmasked(). * We don't call any code which uses arm_cpuidle_save_irq_context() and arm_cpuidle_restore_irq_context() during this window. * We don't call start_thread_common() during this window. * The local_irq_*() code in <asm/irqflags.h> depends solely on an alternative branch since commit: a5f61cc636f48bdf ("arm64: irqflags: use alternative branches for pseudo-NMI logic") ... and hence will use the default (DAIF-only) masking behaviour until alternatives are patched. * Secondary CPUs are brought up later after alternatives are patched, and alternatives are patched on the boot CPU immediately prior to calling init_gic_priority_masking(), so we'll correctly initialize interrupt masking regardless. This patch replaces the use of cpus_have_const_cap() with alternative_has_cap_unlikely(), which avoid generating code to test the system_cpucaps bitmap and should be better for all subsequent calls at runtime. As this makes system_uses_irq_prio_masking() equivalent to __irqflags_uses_pmr(), the latter is removed and replaced with the former for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 11:24:43 +01:00
if (system_uses_irq_prio_masking()) {
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
__pmr_local_irq_enable();
} else {
__daif_local_irq_enable();
}
}
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
static __always_inline void __daif_local_irq_disable(void)
{
barrier();
asm volatile("msr daifset, #3");
barrier();
}
static __always_inline void __pmr_local_irq_disable(void)
{
if (IS_ENABLED(CONFIG_ARM64_DEBUG_PRIORITY_MASKING)) {
u32 pmr = read_sysreg_s(SYS_ICC_PMR_EL1);
WARN_ON_ONCE(pmr != GIC_PRIO_IRQON && pmr != GIC_PRIO_IRQOFF);
}
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
barrier();
write_sysreg_s(GIC_PRIO_IRQOFF, SYS_ICC_PMR_EL1);
barrier();
}
static inline void arch_local_irq_disable(void)
{
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_GIC_PRIO_MASKING In system_uses_irq_prio_masking() we use cpus_have_const_cap() to check for ARM64_HAS_GIC_PRIO_MASKING, but this is not necessary and alternative_has_cap_*() would be preferable. For historical reasons, cpus_have_const_cap() is more complicated than it needs to be. Before cpucaps are finalized, it will perform a bitmap test of the system_cpucaps bitmap, and once cpucaps are finalized it will use an alternative branch. This used to be necessary to handle some race conditions in the window between cpucap detection and the subsequent patching of alternatives and static branches, where different branches could be out-of-sync with one another (or w.r.t. alternative sequences). Now that we use alternative branches instead of static branches, these are all patched atomically w.r.t. one another, and there are only a handful of cases that need special care in the window between cpucap detection and alternative patching. Due to the above, it would be nice to remove cpus_have_const_cap(), and migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(), or cpus_have_cap() depending on when their requirements. This will remove redundant instructions and improve code generation, and will make it easier to determine how each callsite will behave before, during, and after alternative patching. When CONFIG_ARM64_PSEUDO_NMI=y the ARM64_HAS_GIC_PRIO_MASKING cpucap is a strict boot cpu feature which is detected and patched early on the boot cpu, which both happen in smp_prepare_boot_cpu(). In the window between the ARM64_HAS_GIC_PRIO_MASKING cpucap is detected and alternatives are patched we don't run any code that depends upon the ARM64_HAS_GIC_PRIO_MASKING cpucap: * We leave DAIF.IF set until after boot alternatives are patched, and interrupts are unmasked later in init_IRQ(), so we cannot reach IRQ/FIQ entry code and will not use irqs_priority_unmasked(). * We don't call any code which uses arm_cpuidle_save_irq_context() and arm_cpuidle_restore_irq_context() during this window. * We don't call start_thread_common() during this window. * The local_irq_*() code in <asm/irqflags.h> depends solely on an alternative branch since commit: a5f61cc636f48bdf ("arm64: irqflags: use alternative branches for pseudo-NMI logic") ... and hence will use the default (DAIF-only) masking behaviour until alternatives are patched. * Secondary CPUs are brought up later after alternatives are patched, and alternatives are patched on the boot CPU immediately prior to calling init_gic_priority_masking(), so we'll correctly initialize interrupt masking regardless. This patch replaces the use of cpus_have_const_cap() with alternative_has_cap_unlikely(), which avoid generating code to test the system_cpucaps bitmap and should be better for all subsequent calls at runtime. As this makes system_uses_irq_prio_masking() equivalent to __irqflags_uses_pmr(), the latter is removed and replaced with the former for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 11:24:43 +01:00
if (system_uses_irq_prio_masking()) {
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
__pmr_local_irq_disable();
} else {
__daif_local_irq_disable();
}
}
static __always_inline unsigned long __daif_local_save_flags(void)
{
return read_sysreg(daif);
}
static __always_inline unsigned long __pmr_local_save_flags(void)
{
return read_sysreg_s(SYS_ICC_PMR_EL1);
}
/*
* Save the current interrupt enable state.
*/
static inline unsigned long arch_local_save_flags(void)
{
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_GIC_PRIO_MASKING In system_uses_irq_prio_masking() we use cpus_have_const_cap() to check for ARM64_HAS_GIC_PRIO_MASKING, but this is not necessary and alternative_has_cap_*() would be preferable. For historical reasons, cpus_have_const_cap() is more complicated than it needs to be. Before cpucaps are finalized, it will perform a bitmap test of the system_cpucaps bitmap, and once cpucaps are finalized it will use an alternative branch. This used to be necessary to handle some race conditions in the window between cpucap detection and the subsequent patching of alternatives and static branches, where different branches could be out-of-sync with one another (or w.r.t. alternative sequences). Now that we use alternative branches instead of static branches, these are all patched atomically w.r.t. one another, and there are only a handful of cases that need special care in the window between cpucap detection and alternative patching. Due to the above, it would be nice to remove cpus_have_const_cap(), and migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(), or cpus_have_cap() depending on when their requirements. This will remove redundant instructions and improve code generation, and will make it easier to determine how each callsite will behave before, during, and after alternative patching. When CONFIG_ARM64_PSEUDO_NMI=y the ARM64_HAS_GIC_PRIO_MASKING cpucap is a strict boot cpu feature which is detected and patched early on the boot cpu, which both happen in smp_prepare_boot_cpu(). In the window between the ARM64_HAS_GIC_PRIO_MASKING cpucap is detected and alternatives are patched we don't run any code that depends upon the ARM64_HAS_GIC_PRIO_MASKING cpucap: * We leave DAIF.IF set until after boot alternatives are patched, and interrupts are unmasked later in init_IRQ(), so we cannot reach IRQ/FIQ entry code and will not use irqs_priority_unmasked(). * We don't call any code which uses arm_cpuidle_save_irq_context() and arm_cpuidle_restore_irq_context() during this window. * We don't call start_thread_common() during this window. * The local_irq_*() code in <asm/irqflags.h> depends solely on an alternative branch since commit: a5f61cc636f48bdf ("arm64: irqflags: use alternative branches for pseudo-NMI logic") ... and hence will use the default (DAIF-only) masking behaviour until alternatives are patched. * Secondary CPUs are brought up later after alternatives are patched, and alternatives are patched on the boot CPU immediately prior to calling init_gic_priority_masking(), so we'll correctly initialize interrupt masking regardless. This patch replaces the use of cpus_have_const_cap() with alternative_has_cap_unlikely(), which avoid generating code to test the system_cpucaps bitmap and should be better for all subsequent calls at runtime. As this makes system_uses_irq_prio_masking() equivalent to __irqflags_uses_pmr(), the latter is removed and replaced with the former for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 11:24:43 +01:00
if (system_uses_irq_prio_masking()) {
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
return __pmr_local_save_flags();
} else {
return __daif_local_save_flags();
}
}
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
static __always_inline bool __daif_irqs_disabled_flags(unsigned long flags)
{
return flags & PSR_I_BIT;
}
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
static __always_inline bool __pmr_irqs_disabled_flags(unsigned long flags)
{
return flags != GIC_PRIO_IRQON;
}
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
static inline bool arch_irqs_disabled_flags(unsigned long flags)
arm64: Fix incorrect irqflag restore for priority masking When using IRQ priority masking to disable interrupts, in order to deal with the PSR.I state, local_irq_save() would convert the I bit into a PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore() potentially modifying the value of PMR in undesired location due to the state of PSR.I upon flag saving [1]. In an attempt to solve this issue in a less hackish manner, introduce a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent whether PSR.I is being used to disable interrupts, in which case it takes precedence of the status of interrupt masking via PMR. GIC_PRIO_PSR_I_SET is chosen such that (<pmr_value> | GIC_PRIO_PSR_I_SET) does not mask more interrupts than <pmr_value> as some sections (e.g. arch_cpu_idle(), interrupt acknowledge path) requires PMR not to mask interrupts that could be signaled to the CPU when using only PSR.I. [1] https://www.spinics.net/lists/arm-kernel/msg716956.html Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") Cc: <stable@vger.kernel.org> # 5.1.x- Reported-by: Zenghui Yu <yuzenghui@huawei.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wei Li <liwei391@huawei.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Pouloze <suzuki.poulose@arm.com> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-11 10:38:10 +01:00
{
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_GIC_PRIO_MASKING In system_uses_irq_prio_masking() we use cpus_have_const_cap() to check for ARM64_HAS_GIC_PRIO_MASKING, but this is not necessary and alternative_has_cap_*() would be preferable. For historical reasons, cpus_have_const_cap() is more complicated than it needs to be. Before cpucaps are finalized, it will perform a bitmap test of the system_cpucaps bitmap, and once cpucaps are finalized it will use an alternative branch. This used to be necessary to handle some race conditions in the window between cpucap detection and the subsequent patching of alternatives and static branches, where different branches could be out-of-sync with one another (or w.r.t. alternative sequences). Now that we use alternative branches instead of static branches, these are all patched atomically w.r.t. one another, and there are only a handful of cases that need special care in the window between cpucap detection and alternative patching. Due to the above, it would be nice to remove cpus_have_const_cap(), and migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(), or cpus_have_cap() depending on when their requirements. This will remove redundant instructions and improve code generation, and will make it easier to determine how each callsite will behave before, during, and after alternative patching. When CONFIG_ARM64_PSEUDO_NMI=y the ARM64_HAS_GIC_PRIO_MASKING cpucap is a strict boot cpu feature which is detected and patched early on the boot cpu, which both happen in smp_prepare_boot_cpu(). In the window between the ARM64_HAS_GIC_PRIO_MASKING cpucap is detected and alternatives are patched we don't run any code that depends upon the ARM64_HAS_GIC_PRIO_MASKING cpucap: * We leave DAIF.IF set until after boot alternatives are patched, and interrupts are unmasked later in init_IRQ(), so we cannot reach IRQ/FIQ entry code and will not use irqs_priority_unmasked(). * We don't call any code which uses arm_cpuidle_save_irq_context() and arm_cpuidle_restore_irq_context() during this window. * We don't call start_thread_common() during this window. * The local_irq_*() code in <asm/irqflags.h> depends solely on an alternative branch since commit: a5f61cc636f48bdf ("arm64: irqflags: use alternative branches for pseudo-NMI logic") ... and hence will use the default (DAIF-only) masking behaviour until alternatives are patched. * Secondary CPUs are brought up later after alternatives are patched, and alternatives are patched on the boot CPU immediately prior to calling init_gic_priority_masking(), so we'll correctly initialize interrupt masking regardless. This patch replaces the use of cpus_have_const_cap() with alternative_has_cap_unlikely(), which avoid generating code to test the system_cpucaps bitmap and should be better for all subsequent calls at runtime. As this makes system_uses_irq_prio_masking() equivalent to __irqflags_uses_pmr(), the latter is removed and replaced with the former for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 11:24:43 +01:00
if (system_uses_irq_prio_masking()) {
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
return __pmr_irqs_disabled_flags(flags);
} else {
return __daif_irqs_disabled_flags(flags);
}
}
arm64: Fix incorrect irqflag restore for priority masking When using IRQ priority masking to disable interrupts, in order to deal with the PSR.I state, local_irq_save() would convert the I bit into a PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore() potentially modifying the value of PMR in undesired location due to the state of PSR.I upon flag saving [1]. In an attempt to solve this issue in a less hackish manner, introduce a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent whether PSR.I is being used to disable interrupts, in which case it takes precedence of the status of interrupt masking via PMR. GIC_PRIO_PSR_I_SET is chosen such that (<pmr_value> | GIC_PRIO_PSR_I_SET) does not mask more interrupts than <pmr_value> as some sections (e.g. arch_cpu_idle(), interrupt acknowledge path) requires PMR not to mask interrupts that could be signaled to the CPU when using only PSR.I. [1] https://www.spinics.net/lists/arm-kernel/msg716956.html Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") Cc: <stable@vger.kernel.org> # 5.1.x- Reported-by: Zenghui Yu <yuzenghui@huawei.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wei Li <liwei391@huawei.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Pouloze <suzuki.poulose@arm.com> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-11 10:38:10 +01:00
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
static __always_inline bool __daif_irqs_disabled(void)
{
return __daif_irqs_disabled_flags(__daif_local_save_flags());
}
arm64: Fix incorrect irqflag restore for priority masking When using IRQ priority masking to disable interrupts, in order to deal with the PSR.I state, local_irq_save() would convert the I bit into a PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore() potentially modifying the value of PMR in undesired location due to the state of PSR.I upon flag saving [1]. In an attempt to solve this issue in a less hackish manner, introduce a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent whether PSR.I is being used to disable interrupts, in which case it takes precedence of the status of interrupt masking via PMR. GIC_PRIO_PSR_I_SET is chosen such that (<pmr_value> | GIC_PRIO_PSR_I_SET) does not mask more interrupts than <pmr_value> as some sections (e.g. arch_cpu_idle(), interrupt acknowledge path) requires PMR not to mask interrupts that could be signaled to the CPU when using only PSR.I. [1] https://www.spinics.net/lists/arm-kernel/msg716956.html Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") Cc: <stable@vger.kernel.org> # 5.1.x- Reported-by: Zenghui Yu <yuzenghui@huawei.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wei Li <liwei391@huawei.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Pouloze <suzuki.poulose@arm.com> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-11 10:38:10 +01:00
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
static __always_inline bool __pmr_irqs_disabled(void)
{
return __pmr_irqs_disabled_flags(__pmr_local_save_flags());
arm64: Fix incorrect irqflag restore for priority masking When using IRQ priority masking to disable interrupts, in order to deal with the PSR.I state, local_irq_save() would convert the I bit into a PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore() potentially modifying the value of PMR in undesired location due to the state of PSR.I upon flag saving [1]. In an attempt to solve this issue in a less hackish manner, introduce a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent whether PSR.I is being used to disable interrupts, in which case it takes precedence of the status of interrupt masking via PMR. GIC_PRIO_PSR_I_SET is chosen such that (<pmr_value> | GIC_PRIO_PSR_I_SET) does not mask more interrupts than <pmr_value> as some sections (e.g. arch_cpu_idle(), interrupt acknowledge path) requires PMR not to mask interrupts that could be signaled to the CPU when using only PSR.I. [1] https://www.spinics.net/lists/arm-kernel/msg716956.html Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") Cc: <stable@vger.kernel.org> # 5.1.x- Reported-by: Zenghui Yu <yuzenghui@huawei.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wei Li <liwei391@huawei.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Pouloze <suzuki.poulose@arm.com> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-11 10:38:10 +01:00
}
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
static inline bool arch_irqs_disabled(void)
{
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_GIC_PRIO_MASKING In system_uses_irq_prio_masking() we use cpus_have_const_cap() to check for ARM64_HAS_GIC_PRIO_MASKING, but this is not necessary and alternative_has_cap_*() would be preferable. For historical reasons, cpus_have_const_cap() is more complicated than it needs to be. Before cpucaps are finalized, it will perform a bitmap test of the system_cpucaps bitmap, and once cpucaps are finalized it will use an alternative branch. This used to be necessary to handle some race conditions in the window between cpucap detection and the subsequent patching of alternatives and static branches, where different branches could be out-of-sync with one another (or w.r.t. alternative sequences). Now that we use alternative branches instead of static branches, these are all patched atomically w.r.t. one another, and there are only a handful of cases that need special care in the window between cpucap detection and alternative patching. Due to the above, it would be nice to remove cpus_have_const_cap(), and migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(), or cpus_have_cap() depending on when their requirements. This will remove redundant instructions and improve code generation, and will make it easier to determine how each callsite will behave before, during, and after alternative patching. When CONFIG_ARM64_PSEUDO_NMI=y the ARM64_HAS_GIC_PRIO_MASKING cpucap is a strict boot cpu feature which is detected and patched early on the boot cpu, which both happen in smp_prepare_boot_cpu(). In the window between the ARM64_HAS_GIC_PRIO_MASKING cpucap is detected and alternatives are patched we don't run any code that depends upon the ARM64_HAS_GIC_PRIO_MASKING cpucap: * We leave DAIF.IF set until after boot alternatives are patched, and interrupts are unmasked later in init_IRQ(), so we cannot reach IRQ/FIQ entry code and will not use irqs_priority_unmasked(). * We don't call any code which uses arm_cpuidle_save_irq_context() and arm_cpuidle_restore_irq_context() during this window. * We don't call start_thread_common() during this window. * The local_irq_*() code in <asm/irqflags.h> depends solely on an alternative branch since commit: a5f61cc636f48bdf ("arm64: irqflags: use alternative branches for pseudo-NMI logic") ... and hence will use the default (DAIF-only) masking behaviour until alternatives are patched. * Secondary CPUs are brought up later after alternatives are patched, and alternatives are patched on the boot CPU immediately prior to calling init_gic_priority_masking(), so we'll correctly initialize interrupt masking regardless. This patch replaces the use of cpus_have_const_cap() with alternative_has_cap_unlikely(), which avoid generating code to test the system_cpucaps bitmap and should be better for all subsequent calls at runtime. As this makes system_uses_irq_prio_masking() equivalent to __irqflags_uses_pmr(), the latter is removed and replaced with the former for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 11:24:43 +01:00
if (system_uses_irq_prio_masking()) {
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
return __pmr_irqs_disabled();
} else {
return __daif_irqs_disabled();
}
}
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
static __always_inline unsigned long __daif_local_irq_save(void)
{
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
unsigned long flags = __daif_local_save_flags();
__daif_local_irq_disable();
return flags;
}
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
static __always_inline unsigned long __pmr_local_irq_save(void)
{
unsigned long flags = __pmr_local_save_flags();
arm64: Fix incorrect irqflag restore for priority masking When using IRQ priority masking to disable interrupts, in order to deal with the PSR.I state, local_irq_save() would convert the I bit into a PMR value (GIC_PRIO_IRQOFF). This resulted in local_irq_restore() potentially modifying the value of PMR in undesired location due to the state of PSR.I upon flag saving [1]. In an attempt to solve this issue in a less hackish manner, introduce a bit (GIC_PRIO_IGNORE_PMR) for the PMR values that can represent whether PSR.I is being used to disable interrupts, in which case it takes precedence of the status of interrupt masking via PMR. GIC_PRIO_PSR_I_SET is chosen such that (<pmr_value> | GIC_PRIO_PSR_I_SET) does not mask more interrupts than <pmr_value> as some sections (e.g. arch_cpu_idle(), interrupt acknowledge path) requires PMR not to mask interrupts that could be signaled to the CPU when using only PSR.I. [1] https://www.spinics.net/lists/arm-kernel/msg716956.html Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking") Cc: <stable@vger.kernel.org> # 5.1.x- Reported-by: Zenghui Yu <yuzenghui@huawei.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wei Li <liwei391@huawei.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Pouloze <suzuki.poulose@arm.com> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Julien Thierry <julien.thierry@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-11 10:38:10 +01:00
/*
* There are too many states with IRQs disabled, just keep the current
* state if interrupts are already disabled/masked.
*/
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
if (!__pmr_irqs_disabled_flags(flags))
__pmr_local_irq_disable();
return flags;
}
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
static inline unsigned long arch_local_irq_save(void)
{
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_GIC_PRIO_MASKING In system_uses_irq_prio_masking() we use cpus_have_const_cap() to check for ARM64_HAS_GIC_PRIO_MASKING, but this is not necessary and alternative_has_cap_*() would be preferable. For historical reasons, cpus_have_const_cap() is more complicated than it needs to be. Before cpucaps are finalized, it will perform a bitmap test of the system_cpucaps bitmap, and once cpucaps are finalized it will use an alternative branch. This used to be necessary to handle some race conditions in the window between cpucap detection and the subsequent patching of alternatives and static branches, where different branches could be out-of-sync with one another (or w.r.t. alternative sequences). Now that we use alternative branches instead of static branches, these are all patched atomically w.r.t. one another, and there are only a handful of cases that need special care in the window between cpucap detection and alternative patching. Due to the above, it would be nice to remove cpus_have_const_cap(), and migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(), or cpus_have_cap() depending on when their requirements. This will remove redundant instructions and improve code generation, and will make it easier to determine how each callsite will behave before, during, and after alternative patching. When CONFIG_ARM64_PSEUDO_NMI=y the ARM64_HAS_GIC_PRIO_MASKING cpucap is a strict boot cpu feature which is detected and patched early on the boot cpu, which both happen in smp_prepare_boot_cpu(). In the window between the ARM64_HAS_GIC_PRIO_MASKING cpucap is detected and alternatives are patched we don't run any code that depends upon the ARM64_HAS_GIC_PRIO_MASKING cpucap: * We leave DAIF.IF set until after boot alternatives are patched, and interrupts are unmasked later in init_IRQ(), so we cannot reach IRQ/FIQ entry code and will not use irqs_priority_unmasked(). * We don't call any code which uses arm_cpuidle_save_irq_context() and arm_cpuidle_restore_irq_context() during this window. * We don't call start_thread_common() during this window. * The local_irq_*() code in <asm/irqflags.h> depends solely on an alternative branch since commit: a5f61cc636f48bdf ("arm64: irqflags: use alternative branches for pseudo-NMI logic") ... and hence will use the default (DAIF-only) masking behaviour until alternatives are patched. * Secondary CPUs are brought up later after alternatives are patched, and alternatives are patched on the boot CPU immediately prior to calling init_gic_priority_masking(), so we'll correctly initialize interrupt masking regardless. This patch replaces the use of cpus_have_const_cap() with alternative_has_cap_unlikely(), which avoid generating code to test the system_cpucaps bitmap and should be better for all subsequent calls at runtime. As this makes system_uses_irq_prio_masking() equivalent to __irqflags_uses_pmr(), the latter is removed and replaced with the former for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 11:24:43 +01:00
if (system_uses_irq_prio_masking()) {
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
return __pmr_local_irq_save();
} else {
return __daif_local_irq_save();
}
}
static __always_inline void __daif_local_irq_restore(unsigned long flags)
{
barrier();
write_sysreg(flags, daif);
barrier();
}
static __always_inline void __pmr_local_irq_restore(unsigned long flags)
{
barrier();
write_sysreg_s(flags, SYS_ICC_PMR_EL1);
pmr_sync();
barrier();
}
/*
* restore saved IRQ state
*/
static inline void arch_local_irq_restore(unsigned long flags)
{
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_GIC_PRIO_MASKING In system_uses_irq_prio_masking() we use cpus_have_const_cap() to check for ARM64_HAS_GIC_PRIO_MASKING, but this is not necessary and alternative_has_cap_*() would be preferable. For historical reasons, cpus_have_const_cap() is more complicated than it needs to be. Before cpucaps are finalized, it will perform a bitmap test of the system_cpucaps bitmap, and once cpucaps are finalized it will use an alternative branch. This used to be necessary to handle some race conditions in the window between cpucap detection and the subsequent patching of alternatives and static branches, where different branches could be out-of-sync with one another (or w.r.t. alternative sequences). Now that we use alternative branches instead of static branches, these are all patched atomically w.r.t. one another, and there are only a handful of cases that need special care in the window between cpucap detection and alternative patching. Due to the above, it would be nice to remove cpus_have_const_cap(), and migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(), or cpus_have_cap() depending on when their requirements. This will remove redundant instructions and improve code generation, and will make it easier to determine how each callsite will behave before, during, and after alternative patching. When CONFIG_ARM64_PSEUDO_NMI=y the ARM64_HAS_GIC_PRIO_MASKING cpucap is a strict boot cpu feature which is detected and patched early on the boot cpu, which both happen in smp_prepare_boot_cpu(). In the window between the ARM64_HAS_GIC_PRIO_MASKING cpucap is detected and alternatives are patched we don't run any code that depends upon the ARM64_HAS_GIC_PRIO_MASKING cpucap: * We leave DAIF.IF set until after boot alternatives are patched, and interrupts are unmasked later in init_IRQ(), so we cannot reach IRQ/FIQ entry code and will not use irqs_priority_unmasked(). * We don't call any code which uses arm_cpuidle_save_irq_context() and arm_cpuidle_restore_irq_context() during this window. * We don't call start_thread_common() during this window. * The local_irq_*() code in <asm/irqflags.h> depends solely on an alternative branch since commit: a5f61cc636f48bdf ("arm64: irqflags: use alternative branches for pseudo-NMI logic") ... and hence will use the default (DAIF-only) masking behaviour until alternatives are patched. * Secondary CPUs are brought up later after alternatives are patched, and alternatives are patched on the boot CPU immediately prior to calling init_gic_priority_masking(), so we'll correctly initialize interrupt masking regardless. This patch replaces the use of cpus_have_const_cap() with alternative_has_cap_unlikely(), which avoid generating code to test the system_cpucaps bitmap and should be better for all subsequent calls at runtime. As this makes system_uses_irq_prio_masking() equivalent to __irqflags_uses_pmr(), the latter is removed and replaced with the former for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 11:24:43 +01:00
if (system_uses_irq_prio_masking()) {
arm64: irqflags: use alternative branches for pseudo-NMI logic Due to the way we use alternatives in the irqflags code, even when CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for pseudo-NMI management. This patch reworks the irqflags code to remove the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the more common case, and will permit further rework of our DAIF management (e.g. in preparation for ARMv8.8-A's NMI feature). Prior to this patch a defconfig kernel has hundreds of redundant instructions to access ICC_PMR_EL1 (which should only need to be manipulated in setup code), which this patch removes: | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l | 885 | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l | 5 Those instructions alone account for more than 3KiB of kernel text, and will be associated with additional alt_instr entries, padding and branches, etc. These redundant instructions exist because we use alternative sequences for to choose between DAIF / PMR management in irqflags.h, and even when CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the code for PMR management, along with alt_instr entries. We use alternatives here as this was necessary to ensure that we never encounter a mismatched local_irq_save() ... local_irq_restore() sequence in the middle of patching, which was possible to see if we used static keys to choose between DAIF and PMR management. Since commit: 21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()") ... we have a mechanism to use alternatives similarly to static keys, allowing us to write the bulk of the logic in C code while also being able to rely on all sites being patched in one go, and avoiding a mismatched mismatched local_irq_save() ... local_irq_restore() sequence during patching. This patch rewrites arm64's local_irq_*() functions to use alternative branches. This allows for the pseudo-NMI code to be entirely elided when CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y: | [mark@lakrids:~/src/linux]% ls -al vmlinux-* | -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig | -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi | -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig | -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi | [mark@lakrids:~/src/linux]% ls -al Image-* | -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi | -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig | -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi Some sensitive code depends on being run with interrupts enabled or with interrupts disabled, and so when enabling or disabling interrupts we must ensure that the compiler does not move such code around the actual enable/disable. Before this patch, that was ensured by the combined asm volatile blocks having memory clobbers (and any sensitive code either being asm volatile, or touching memory). This patch consistently uses explicit barrier() operations before and after the enable/disable, which allows us to use the usual sysreg accessors (which are asm volatile) to manipulate the interrupt masks. The use of pmr_sync() is pulled within this critical section for consistency. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 14:54:29 +00:00
__pmr_local_irq_restore(flags);
} else {
__daif_local_irq_restore(flags);
}
}
#endif /* __ASM_IRQFLAGS_H */