2019-06-03 07:44:50 +02:00
|
|
|
/* SPDX-License-Identifier: GPL-2.0-only */
|
2014-01-07 22:17:13 +08:00
|
|
|
/*
|
|
|
|
* Copyright (C) 2013 Huawei Ltd.
|
|
|
|
* Author: Jiang Liu <liuj97@gmail.com>
|
|
|
|
*
|
|
|
|
* Based on arch/arm/include/asm/jump_label.h
|
|
|
|
*/
|
|
|
|
#ifndef __ASM_JUMP_LABEL_H
|
|
|
|
#define __ASM_JUMP_LABEL_H
|
2015-04-09 13:51:30 +10:00
|
|
|
|
|
|
|
#ifndef __ASSEMBLY__
|
|
|
|
|
2014-01-07 22:17:13 +08:00
|
|
|
#include <linux/types.h>
|
|
|
|
#include <asm/insn.h>
|
|
|
|
|
arm64: jump_label: Ensure patched jump_labels are visible to all CPUs
Although the Arm architecture permits concurrent modification and
execution of NOP and branch instructions, it still requires some
synchronisation to ensure that other CPUs consistently execute the newly
written instruction:
> When the modified instructions are observable, each PE that is
> executing the modified instructions must execute an ISB or perform a
> context synchronizing event to ensure execution of the modified
> instructions
Prior to commit f6cc0c501649 ("arm64: Avoid calling stop_machine() when
patching jump labels"), the arm64 jump_label patching machinery
performed synchronisation using stop_machine() after each modification,
however this was problematic when flipping static keys from atomic
contexts (namely, the arm_arch_timer CPU hotplug startup notifier) and
so we switched to the _nosync() patching routines to avoid "scheduling
while atomic" BUG()s during boot.
In hindsight, the analysis of the issue in f6cc0c501649 isn't quite
right: it cites the use of IPIs in the default patching routines as the
cause of the lockup, whereas stop_machine() does not rely on IPIs and
the I-cache invalidation is performed using __flush_icache_range(),
which elides the call to kick_all_cpus_sync(). In fact, the blocking
wait for other CPUs is what triggers the BUG() and the problem remains
even after f6cc0c501649, for example because we could block on the
jump_label_mutex. Eventually, the arm_arch_timer driver was fixed to
avoid the static key entirely in commit a862fc2254bd
("clocksource/arm_arch_timer: Remove use of workaround static key").
This all leaves the jump_label patching code in a funny situation on
arm64 as we do not synchronise with other CPUs to reduce the likelihood
of a bug which no longer exists. Consequently, toggling a static key on
one CPU cannot be assumed to take effect on other CPUs, leading to
potential issues, for example with missing preempt notifiers.
Rather than revert f6cc0c501649 and go back to stop_machine() for each
patch site, implement arch_jump_label_transform_apply() and kick all
the other CPUs with an IPI at the end of patching.
Cc: Alexander Potapenko <glider@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Fixes: f6cc0c501649 ("arm64: Avoid calling stop_machine() when patching jump labels")
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240731133601.3073-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-07-31 14:36:01 +01:00
|
|
|
#define HAVE_JUMP_LABEL_BATCH
|
2014-01-07 22:17:13 +08:00
|
|
|
#define JUMP_LABEL_NOP_SIZE AARCH64_INSN_SIZE
|
|
|
|
|
2024-04-30 16:56:55 +08:00
|
|
|
#define JUMP_TABLE_ENTRY(key, label) \
|
|
|
|
".pushsection __jump_table, \"aw\"\n\t" \
|
|
|
|
".align 3\n\t" \
|
2024-10-30 16:04:27 +00:00
|
|
|
".long 1b - ., " label " - .\n\t" \
|
|
|
|
".quad " key " - .\n\t" \
|
|
|
|
".popsection\n\t"
|
|
|
|
|
|
|
|
/* This macro is also expanded on the Rust side. */
|
|
|
|
#define ARCH_STATIC_BRANCH_ASM(key, label) \
|
|
|
|
"1: nop\n\t" \
|
|
|
|
JUMP_TABLE_ENTRY(key, label)
|
2024-04-30 16:56:55 +08:00
|
|
|
|
2022-10-06 15:55:41 +08:00
|
|
|
static __always_inline bool arch_static_branch(struct static_key * const key,
|
|
|
|
const bool branch)
|
2014-01-07 22:17:13 +08:00
|
|
|
{
|
2024-04-30 16:56:55 +08:00
|
|
|
char *k = &((char *)key)[branch];
|
|
|
|
|
work around gcc bugs with 'asm goto' with outputs
We've had issues with gcc and 'asm goto' before, and we created a
'asm_volatile_goto()' macro for that in the past: see commits
3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation
bug") and a9f180345f53 ("compiler/gcc4: Make quirk for
asm_volatile_goto() unconditional").
Then, much later, we ended up removing the workaround in commit
43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR
58670") because we no longer supported building the kernel with the
affected gcc versions, but we left the macro uses around.
Now, Sean Christopherson reports a new version of a very similar
problem, which is fixed by re-applying that ancient workaround. But the
problem in question is limited to only the 'asm goto with outputs'
cases, so instead of re-introducing the old workaround as-is, let's
rename and limit the workaround to just that much less common case.
It looks like there are at least two separate issues that all hit in
this area:
(a) some versions of gcc don't mark the asm goto as 'volatile' when it
has outputs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420
which is easy to work around by just adding the 'volatile' by hand.
(b) Internal compiler errors:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422
which are worked around by adding the extra empty 'asm' as a
barrier, as in the original workaround.
but the problem Sean sees may be a third thing since it involves bad
code generation (not an ICE) even with the manually added 'volatile'.
but the same old workaround works for this case, even if this feels a
bit like voodoo programming and may only be hiding the issue.
Reported-and-tested-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-02-09 12:39:31 -08:00
|
|
|
asm goto(
|
2024-10-30 16:04:27 +00:00
|
|
|
ARCH_STATIC_BRANCH_ASM("%c0", "%l[l_yes]")
|
|
|
|
: : "i"(k) : : l_yes
|
2024-04-30 16:56:55 +08:00
|
|
|
);
|
2015-07-24 15:09:55 +02:00
|
|
|
|
|
|
|
return false;
|
|
|
|
l_yes:
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2022-10-06 15:55:41 +08:00
|
|
|
static __always_inline bool arch_static_branch_jump(struct static_key * const key,
|
|
|
|
const bool branch)
|
2015-07-24 15:09:55 +02:00
|
|
|
{
|
2024-04-30 16:56:55 +08:00
|
|
|
char *k = &((char *)key)[branch];
|
2024-10-30 16:04:27 +00:00
|
|
|
|
work around gcc bugs with 'asm goto' with outputs
We've had issues with gcc and 'asm goto' before, and we created a
'asm_volatile_goto()' macro for that in the past: see commits
3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation
bug") and a9f180345f53 ("compiler/gcc4: Make quirk for
asm_volatile_goto() unconditional").
Then, much later, we ended up removing the workaround in commit
43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR
58670") because we no longer supported building the kernel with the
affected gcc versions, but we left the macro uses around.
Now, Sean Christopherson reports a new version of a very similar
problem, which is fixed by re-applying that ancient workaround. But the
problem in question is limited to only the 'asm goto with outputs'
cases, so instead of re-introducing the old workaround as-is, let's
rename and limit the workaround to just that much less common case.
It looks like there are at least two separate issues that all hit in
this area:
(a) some versions of gcc don't mark the asm goto as 'volatile' when it
has outputs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420
which is easy to work around by just adding the 'volatile' by hand.
(b) Internal compiler errors:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422
which are worked around by adding the extra empty 'asm' as a
barrier, as in the original workaround.
but the problem Sean sees may be a third thing since it involves bad
code generation (not an ICE) even with the manually added 'volatile'.
but the same old workaround works for this case, even if this feels a
bit like voodoo programming and may only be hiding the issue.
Reported-and-tested-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-02-09 12:39:31 -08:00
|
|
|
asm goto(
|
arm64/kernel: jump_label: Switch to relative references
On a randomly chosen distro kernel build for arm64, vmlinux.o shows the
following sections, containing jump label entries, and the associated
RELA relocation records, respectively:
...
[38088] __jump_table PROGBITS 0000000000000000 00e19f30
000000000002ea10 0000000000000000 WA 0 0 8
[38089] .rela__jump_table RELA 0000000000000000 01fd8bb0
000000000008be30 0000000000000018 I 38178 38088 8
...
In other words, we have 190 KB worth of 'struct jump_entry' instances,
and 573 KB worth of RELA entries to relocate each entry's code, target
and key members. This means the RELA section occupies 10% of the .init
segment, and the two sections combined represent 5% of vmlinux's entire
memory footprint.
So let's switch from 64-bit absolute references to 32-bit relative
references for the code and target field, and a 64-bit relative
reference for the 'key' field (which may reside in another module or the
core kernel, which may be more than 4 GB way on arm64 when running with
KASLR enable): this reduces the size of the __jump_table by 33%, and
gets rid of the RELA section entirely.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jessica Yu <jeyu@kernel.org>
Link: https://lkml.kernel.org/r/20180919065144.25010-4-ard.biesheuvel@linaro.org
2018-09-18 23:51:38 -07:00
|
|
|
"1: b %l[l_yes] \n\t"
|
2024-10-30 16:04:27 +00:00
|
|
|
JUMP_TABLE_ENTRY("%c0", "%l[l_yes]")
|
|
|
|
: : "i"(k) : : l_yes
|
2024-04-30 16:56:55 +08:00
|
|
|
);
|
2014-01-07 22:17:13 +08:00
|
|
|
return false;
|
|
|
|
l_yes:
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2015-04-09 13:51:30 +10:00
|
|
|
#endif /* __ASSEMBLY__ */
|
2014-01-07 22:17:13 +08:00
|
|
|
#endif /* __ASM_JUMP_LABEL_H */
|