License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier should be applied
to a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note", otherwise it was "GPL-2.0". The results were:
   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 15:07:57 +01:00
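The distinction the commit message draws between header and source .c files (which need different comment types) can be sketched in C. This is only an illustration, not the script Thomas and Greg used: the helper names `has_suffix` and `spdx_line` and the extension test are assumptions made for this example. The two comment forms follow the kernel convention of a block comment in headers and a line comment in .c files.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if path ends with the given suffix. */
static int has_suffix(const char *path, const char *suffix)
{
	size_t plen = strlen(path), slen = strlen(suffix);

	return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
}

/*
 * Hypothetical helper: format the SPDX tag line for a file.
 * Headers get a block comment, .c sources get a line comment.
 * Returns 0 on success, -1 for an unhandled file type or a short buffer.
 */
static int spdx_line(const char *path, const char *id, char *buf, size_t len)
{
	int n;

	if (has_suffix(path, ".h"))
		n = snprintf(buf, len, "/* SPDX-License-Identifier: %s */", id);
	else if (has_suffix(path, ".c"))
		n = snprintf(buf, len, "// SPDX-License-Identifier: %s", id);
	else
		return -1;

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For a header path and the identifier "GPL-2.0", the sketch yields the same line that opens this file. Other file types (Makefiles, assembly, scripts) are left unhandled on purpose.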
|
|
|
/* SPDX-License-Identifier: GPL-2.0 */
|
2010-09-17 15:36:40 -07:00
|
|
|
#ifndef _ASM_X86_MWAIT_H
|
|
|
|
#define _ASM_X86_MWAIT_H
|
|
|
|
|
2013-12-12 15:08:36 +01:00
|
|
|
#include <linux/sched.h>
|
2017-02-01 16:36:40 +01:00
|
|
|
#include <linux/sched/idle.h>
|
2013-12-12 15:08:36 +01:00
|
|
|
|
2016-01-26 22:12:04 +01:00
|
|
|
#include <asm/cpufeature.h>
|
2019-02-18 23:04:01 +01:00
|
|
|
#include <asm/nospec-branch.h>
|
2016-01-26 22:12:04 +01:00
|
|
|
|
2010-09-17 15:36:40 -07:00
|
|
|
#define MWAIT_SUBSTATE_MASK 0xf
|
|
|
|
#define MWAIT_CSTATE_MASK 0xf
|
|
|
|
#define MWAIT_SUBSTATE_SIZE 4
|
2013-02-01 23:37:30 -05:00
|
|
|
#define MWAIT_HINT2CSTATE(hint) (((hint) >> MWAIT_SUBSTATE_SIZE) & MWAIT_CSTATE_MASK)
|
|
|
|
#define MWAIT_HINT2SUBSTATE(hint) ((hint) & MWAIT_CSTATE_MASK)
|
2022-06-06 23:33:35 +05:30
|
|
|
#define MWAIT_C1_SUBSTATE_MASK 0xf0
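A small user-space sketch of how the hint macros above split an MWAIT hint into its two 4-bit fields. The macros are copied from this header; the struct `mwait_hint_fields` and the function `mwait_decode_hint` are hypothetical names introduced only for this illustration.

```c
#include <stdint.h>

#define MWAIT_SUBSTATE_MASK	0xf
#define MWAIT_CSTATE_MASK	0xf
#define MWAIT_SUBSTATE_SIZE	4

#define MWAIT_HINT2CSTATE(hint)	(((hint) >> MWAIT_SUBSTATE_SIZE) & MWAIT_CSTATE_MASK)
#define MWAIT_HINT2SUBSTATE(hint)	((hint) & MWAIT_CSTATE_MASK)

/* Hypothetical container for the two decoded fields. */
struct mwait_hint_fields {
	uint32_t cstate;	/* bits 7:4 of the hint */
	uint32_t substate;	/* bits 3:0 of the hint */
};

/* Hypothetical helper: decode a hint with the header's macros. */
static struct mwait_hint_fields mwait_decode_hint(uint32_t hint)
{
	struct mwait_hint_fields f = {
		.cstate   = MWAIT_HINT2CSTATE(hint),
		.substate = MWAIT_HINT2SUBSTATE(hint),
	};

	return f;
}
```

Note that MWAIT_HINT2SUBSTATE() masks with MWAIT_CSTATE_MASK; since both masks are 0xf, the result is the same as masking with MWAIT_SUBSTATE_MASK.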
|
2010-09-17 15:36:40 -07:00
|
|
|
|
|
|
|
#define CPUID5_ECX_EXTENSIONS_SUPPORTED 0x1
|
|
|
|
#define CPUID5_ECX_INTERRUPT_BREAK 0x2
|
|
|
|
|
|
|
|
#define MWAIT_ECX_INTERRUPT_BREAK 0x1
|
2015-08-10 12:19:53 +02:00
|
|
|
#define MWAITX_ECX_TIMER_ENABLE BIT(1)
|
2020-04-24 12:37:54 -07:00
|
|
|
#define MWAITX_MAX_WAIT_CYCLES UINT_MAX
|
2019-10-07 19:00:22 +00:00
|
|
|
#define MWAITX_DISABLE_CSTATES 0xf0
|
2020-04-24 12:37:56 -07:00
|
|
|
#define TPAUSE_C01_STATE 1
|
|
|
|
#define TPAUSE_C02_STATE 0
|
2010-09-17 15:36:40 -07:00
|
|
|
|
2025-04-02 20:08:05 +02:00
|
|
|
static __always_inline void __monitor(const void *eax, u32 ecx, u32 edx)
|
2013-12-12 15:08:36 +01:00
|
|
|
{
|
2025-04-03 14:50:45 +02:00
|
|
|
/*
|
|
|
|
* Use the instruction mnemonic with implicit operands, as the LLVM
|
|
|
|
* assembler fails to assemble the mnemonic with explicit operands:
|
|
|
|
*/
|
|
|
|
asm volatile("monitor" :: "a" (eax), "c" (ecx), "d" (edx));
|
2013-12-12 15:08:36 +01:00
|
|
|
}
|
|
|
|
|
2025-04-02 20:08:05 +02:00
|
|
|
static __always_inline void __monitorx(const void *eax, u32 ecx, u32 edx)
|
2015-08-10 12:19:53 +02:00
|
|
|
{
|
x86/idle: Remove .s output beautifying delimiters from simpler asm() templates
Delimiters in asm() templates such as ';', '\t' or '\n' are not
required syntactically, they were used historically in the Linux
kernel to prettify the compiler's .s output for people who were
looking at compiler generated .s output.
Most x86 developers these days are primarily looking at:
1) objdump --disassemble-all .o
2) perf top's live kernel function annotation and disassembler
feature that uses /dev/mem.
... because:
- this kind of assembler output is standardized regardless of
compiler used,
- it's generally less messy looking,
- it gives ground-truth instead of being some intermediate layer
in the toolchain that might or might not be the real deal,
- and on a live kernel it also sees through the kernel's various
layers of runtime patching code obfuscation facilities, also
known as: alternative-instructions, tracepoints and jump labels.
There are some cases where the .s output is the most useful
tool, such as alternatives() code generation, but other than
that these delimiters used in simple asm() statements mostly
add noise to the source code side, which isn't desirable for
assembly code that is fragile enough already.
Remove the delimiters for <asm/mwait.h>, which also happens to
make the GCC inliner's asm() instruction length heuristics
more accurate...
[ mingo: Wrote a new changelog to give historic context and
to give people a chance to object. :-) ]
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250402180827.3762-3-ubizjak@gmail.com
2025-04-02 20:08:07 +02:00
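To make the changelog above concrete, here is a minimal sketch of the two template styles. It uses `nop` as a stand-in instruction so it can be built in user space (an assumption for illustration; the real templates in this header are `monitor`, `mwait` and the `.byte` encodings), and it spells the keyword `__asm__` so it also compiles under strict ISO modes. The function names are hypothetical.

```c
/* Older style: the trailing "\n\t" only prettifies the compiler's .s output. */
static inline void nop_with_delimiters(void)
{
	__asm__ volatile("nop\n\t" ::: "memory");
}

/* Style after the cleanup: the bare mnemonic, no delimiters. */
static inline void nop_plain(void)
{
	__asm__ volatile("nop" ::: "memory");
}
```

Both functions emit the same single instruction. As far as the GCC documentation describes it, the compiler estimates the size of an asm() by counting newlines and statement separators in the template, so the first form is likely to be counted as longer than it really is, which is the inliner effect the changelog mentions.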
|
|
|
/* "monitorx %eax, %ecx, %edx" */
|
|
|
|
asm volatile(".byte 0x0f, 0x01, 0xfa"
|
2015-08-10 12:19:53 +02:00
|
|
|
:: "a" (eax), "c" (ecx), "d"(edx));
|
|
|
|
}
|
|
|
|
|
2025-04-02 20:08:05 +02:00
|
|
|
static __always_inline void __mwait(u32 eax, u32 ecx)
|
2013-12-12 15:08:36 +01:00
|
|
|
{
|
2025-04-03 14:50:45 +02:00
|
|
|
/*
|
|
|
|
* Use the instruction mnemonic with implicit operands, as the LLVM
|
|
|
|
* assembler fails to assemble the mnemonic with explicit operands:
|
|
|
|
*/
|
|
|
|
asm volatile("mwait" :: "a" (eax), "c" (ecx));
|
2013-12-12 15:08:36 +01:00
|
|
|
}
|
|
|
|
|
2015-08-10 12:19:53 +02:00
|
|
|
/*
|
|
|
|
* MWAITX allows for a timer expiration to get the core out of a wait state in
|
|
|
|
* addition to the default MWAIT exit condition of a store appearing at a
|
|
|
|
* monitored virtual address.
|
|
|
|
*
|
|
|
|
* Registers:
|
|
|
|
*
|
|
|
|
* MWAITX ECX[1]: enable timer if set
|
|
|
|
* MWAITX EBX[31:0]: max wait time expressed in SW P0 clocks. The software P0
|
|
|
|
* frequency is the same as the TSC frequency.
|
|
|
|
*
|
|
|
|
* Below is a comparison between MWAIT and MWAITX on AMD processors:
|
|
|
|
*
|
|
|
|
* MWAIT MWAITX
|
|
|
|
* opcode 0f 01 c9 | 0f 01 fb
|
|
|
|
* ECX[0] value of RFLAGS.IF seen by instruction
|
|
|
|
* ECX[1] unused/#GP if set | enable timer if set
|
|
|
|
* ECX[31:2] unused/#GP if set
|
|
|
|
* EAX unused (reserve for hint)
|
|
|
|
* EBX[31:0] unused | max wait time (P0 clocks)
|
|
|
|
*
|
|
|
|
* MONITOR MONITORX
|
|
|
|
* opcode 0f 01 c8 | 0f 01 fa
|
|
|
|
* EAX (logical) address to monitor
|
|
|
|
* ECX #GP if not zero
|
|
|
|
*/
|
2025-04-02 20:08:05 +02:00
|
|
|
static __always_inline void __mwaitx(u32 eax, u32 ebx, u32 ecx)
|
2015-08-10 12:19:53 +02:00
|
|
|
{
|
2024-09-11 10:53:08 +02:00
|
|
|
/* No need for TSA buffer clearing on AMD */
|
2019-02-18 23:04:01 +01:00
|
|
|
|
2025-04-02 20:08:07 +02:00
|
|
|
/* "mwaitx %eax, %ebx, %ecx" */
|
|
|
|
asm volatile(".byte 0x0f, 0x01, 0xfb"
|
2015-08-10 12:19:53 +02:00
|
|
|
:: "a" (eax), "b" (ebx), "c" (ecx));
|
|
|
|
}
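The register table in the comment above can be turned into a short sketch of how a caller might prepare the three __mwaitx() arguments for a timed wait. The defines are taken from this header (with BIT(1) written out); the struct `mwaitx_args`, the function `mwaitx_timed_args`, and the choice to pass MWAITX_DISABLE_CSTATES as the EAX hint are assumptions made for this illustration, not code from this file.

```c
#include <limits.h>
#include <stdint.h>

#define MWAITX_ECX_TIMER_ENABLE	(1U << 1)	/* BIT(1) in the header */
#define MWAITX_MAX_WAIT_CYCLES	UINT_MAX
#define MWAITX_DISABLE_CSTATES	0xf0

/* Hypothetical container for the __mwaitx(eax, ebx, ecx) arguments. */
struct mwaitx_args {
	uint32_t eax;	/* hint */
	uint32_t ebx;	/* max wait time in SW P0 (TSC) clocks */
	uint32_t ecx;	/* ECX[1] set: enable the timer */
};

/* Hypothetical helper: clamp the wait to 32 bits and enable the timer. */
static struct mwaitx_args mwaitx_timed_args(uint64_t cycles)
{
	struct mwaitx_args a = {
		.eax = MWAITX_DISABLE_CSTATES,
		.ebx = cycles > MWAITX_MAX_WAIT_CYCLES ?
			MWAITX_MAX_WAIT_CYCLES : (uint32_t)cycles,
		.ecx = MWAITX_ECX_TIMER_ENABLE,
	};

	return a;
}
```

The clamp is needed because EBX is only 32 bits wide, so a longer wait has to be split into several MWAITX invocations by the caller.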
|
|
|
|
|
2023-11-15 10:13:22 -05:00
|
|
|
/*
|
|
|
|
* Re-enable interrupts right upon calling mwait in such a way that
|
|
|
|
* no interrupt can fire _before_ the execution of mwait, ie: no
|
|
|
|
* instruction must be placed between "sti" and "mwait".
|
|
|
|
*
|
|
|
|
* This is necessary because if an interrupt queues a timer before
|
|
|
|
* executing mwait, it would otherwise go unnoticed and the next tick
|
|
|
|
* would not be reprogrammed accordingly before mwait ever wakes up.
|
|
|
|
*/
|
2025-04-02 20:08:05 +02:00
|
|
|
static __always_inline void __sti_mwait(u32 eax, u32 ecx)
|
sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power savings and to improve performance
In Linux-3.9 we removed the mwait_idle() loop:
69fb3676df33 ("x86 idle: remove mwait_idle() and "idle=mwait" cmdline param")
The reasoning was that modern machines should be sufficiently
happy during the boot process using the default_idle() HALT
loop, until cpuidle loads and either acpi_idle or intel_idle
invoke the newer MWAIT-with-hints idle loop.
But two machines reported problems:
1. Certain Core2-era machines support MWAIT-C1 and HALT only.
MWAIT-C1 is preferred for optimal power and performance.
But if they support just C1, cpuidle never loads and
so they use the boot-time default idle loop forever.
2. Some laptops will boot-hang if HALT is used,
but will boot successfully if MWAIT is used.
This appears to be a hidden assumption in BIOS SMI,
that is presumably valid on the proprietary OS
where the BIOS was validated.
https://bugzilla.kernel.org/show_bug.cgi?id=60770
So here we effectively revert the patch above, restoring
the mwait_idle() loop. However, we don't bother restoring
the idle=mwait cmdline parameter, since it appears to add
no value.
Maintainer notes:
For 3.9, simply revert 69fb3676df
For 3.10, patch -F3 applies, fuzz needed due to __cpuinit use in context
For 3.11, 3.12, 3.13, this patch applies cleanly
Tested-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Mike Galbraith <bitbucket@online.de>
Cc: <stable@vger.kernel.org> # 3.9+
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ian Malone <ibmalone@gmail.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/345254a551eb5a6a866e048d7ab570fd2193aca4.1389763084.git.len.brown@intel.com
[ Ported to recent kernels. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-15 00:37:34 -05:00
|
|
|
{
|
2025-04-03 14:50:45 +02:00
|
|
|
|
|
|
|
asm volatile("sti; mwait" :: "a" (eax), "c" (ecx));
|
2014-01-15 00:37:34 -05:00
|
|
|
}
|
|
|
|
|
2013-12-12 15:08:36 +01:00
|
|
|
/*
|
|
|
|
* This uses new MONITOR/MWAIT instructions on P4 processors with PNI,
|
|
|
|
* which can obviate IPI to trigger checking of need_resched.
|
|
|
|
* We execute MONITOR against need_resched and enter optimized wait state
|
|
|
|
* through MWAIT. Whenever someone changes need_resched, we would be woken
|
|
|
|
* up from MWAIT (without an IPI).
|
|
|
|
*
|
|
|
|
* New with Core Duo processors, MWAIT can take some hints based on CPU
|
|
|
|
* capability.
|
|
|
|
*/
|
2025-04-03 09:30:49 +02:00
|
|
|
static __always_inline void mwait_idle_with_hints(u32 eax, u32 ecx)
|
2013-12-12 15:08:36 +01:00
|
|
|
{
|
2025-04-14 15:33:19 +02:00
|
|
|
if (need_resched())
|
|
|
|
return;
|
|
|
|
|
|
|
|
x86_idle_clear_cpu_buffers();
|
|
|
|
|
2016-07-18 11:41:10 -07:00
|
|
|
if (static_cpu_has_bug(X86_BUG_MONITOR) || !current_set_polling_and_test()) {
|
x86/idle: Remove MFENCEs for X86_BUG_CLFLUSH_MONITOR in mwait_idle_with_hints() and prefer_mwait_c1_over_halt()
The following commit, 12 years ago:
7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers")
added barriers around the CLFLUSH in mwait_idle_with_hints(), justified with:
... and add memory barriers around it since the documentation is explicit
that CLFLUSH is only ordered with respect to MFENCE.
This also triggered, 11 years ago, the same adjustment in:
f8e617f45829 ("sched/idle/x86: Optimize unnecessary mwait_idle() resched IPIs")
during development, although it failed to get the static_cpu_has_bug() treatment.
X86_BUG_CLFLUSH_MONITOR (a.k.a the AAI65 errata) is specific to Intel CPUs,
and the SDM currently states:
Executions of the CLFLUSH instruction are ordered with respect to each
other and with respect to writes, locked read-modify-write instructions,
and fence instructions[1].
With footnote 1 reading:
Earlier versions of this manual specified that executions of the CLFLUSH
instruction were ordered only by the MFENCE instruction. All processors
implementing the CLFLUSH instruction also order it relative to the other
operations enumerated above.
i.e. The SDM was incorrect at the time, and barriers should not have been
inserted. Double checking the original AAI65 errata (not available from
intel.com any more) shows no mention of barriers either.
Note: If this were a general codepath, the MFENCEs would be needed, because
AMD CPUs of the same vintage do sport otherwise-unordered CLFLUSHs.
Remove the unnecessary barriers. Furthermore, use a plain alternative(),
rather than static_cpu_has_bug() and/or no optimisation. The workaround
is a single instruction.
Use an explicit %rax pointer rather than a general memory operand, because
MONITOR takes the pointer implicitly in the same way.
[ mingo: Cleaned up the commit a bit. ]
Fixes: 7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20250402172458.1378112-1-andrew.cooper3@citrix.com
2025-04-02 18:24:58 +01:00
|
|
|
const void *addr = &current_thread_info()->flags;
|
2013-12-12 15:08:36 +01:00
|
|
|
|
2025-04-02 18:24:58 +01:00
|
|
|
alternative_input("", "clflush (%[addr])", X86_BUG_CLFLUSH_MONITOR, [addr] "a" (addr));
|
|
|
|
__monitor(addr, 0, 0);
|
2023-11-15 10:13:23 -05:00
|
|
|
|
2025-04-14 15:33:19 +02:00
|
|
|
if (need_resched())
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
if (ecx & 1) {
|
|
|
|
__mwait(eax, ecx);
|
|
|
|
} else {
|
|
|
|
__sti_mwait(eax, ecx);
|
|
|
|
raw_local_irq_disable();
|
2023-11-15 10:13:23 -05:00
|
|
|
}
|
2013-12-12 15:08:36 +01:00
|
|
|
}
|
2025-04-14 15:33:19 +02:00
|
|
|
|
|
|
|
out:
|
2013-11-20 12:22:37 +01:00
|
|
|
current_clr_polling();
|
2013-12-12 15:08:36 +01:00
|
|
|
}
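The choice between __mwait() and __sti_mwait() in mwait_idle_with_hints() depends only on bit 0 of ECX. The sketch below models that one decision in user space; it does not execute MWAIT. The enum and the function `mwait_pick_entry` are hypothetical names, and reading the literal `1` in `ecx & 1` as MWAIT_ECX_INTERRUPT_BREAK is an assumption based on the define earlier in this header.

```c
#include <stdint.h>

#define MWAIT_ECX_INTERRUPT_BREAK	0x1

/* Hypothetical labels for the two entry paths. */
enum mwait_entry {
	MWAIT_ENTRY_PLAIN,	/* __mwait(): interrupts stay disabled */
	MWAIT_ENTRY_STI,	/* __sti_mwait(), then raw_local_irq_disable() */
};

/* Hypothetical helper: mirror the "if (ecx & 1)" branch. */
static enum mwait_entry mwait_pick_entry(uint32_t ecx)
{
	if (ecx & MWAIT_ECX_INTERRUPT_BREAK)
		return MWAIT_ENTRY_PLAIN;

	return MWAIT_ENTRY_STI;
}
```

With ECX[0] set, a pending interrupt wakes the CPU even while interrupts are masked, so they can stay disabled across MWAIT. Without it, interrupts have to be enabled by the `sti` immediately before `mwait` and disabled again afterwards.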
|
|
|
|
|
2020-04-24 12:37:56 -07:00
|
|
|
/*
|
|
|
|
* Caller can specify whether to enter C0.1 (low latency, less
|
|
|
|
* power saving) or C0.2 state (saves more power, but longer wakeup
|
|
|
|
* latency). This may be overridden by the IA32_UMWAIT_CONTROL MSR
|
|
|
|
* which can force requests for C0.2 to be downgraded to C0.1.
|
|
|
|
*/
|
|
|
|
static inline void __tpause(u32 ecx, u32 edx, u32 eax)
|
|
|
|
{
|
2025-04-02 20:08:08 +02:00
|
|
|
/* "tpause %ecx" */
|
2025-04-02 20:08:07 +02:00
|
|
|
asm volatile(".byte 0x66, 0x0f, 0xae, 0xf1"
|
2025-04-02 20:08:08 +02:00
|
|
|
:: "c" (ecx), "d" (edx), "a" (eax));
|
2020-04-24 12:37:56 -07:00
|
|
|
}
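A sketch of how a caller might prepare the three __tpause() arguments. TPAUSE takes the requested state in bit 0 of its register operand and a 64-bit TSC deadline in EDX:EAX, so the deadline has to be split into two 32-bit halves. The struct `tpause_args` and the function `tpause_prepare` are hypothetical names introduced for this illustration; the state defines are the ones from this header.

```c
#include <stdint.h>

#define TPAUSE_C01_STATE	1
#define TPAUSE_C02_STATE	0

/* Hypothetical container for the __tpause(ecx, edx, eax) arguments. */
struct tpause_args {
	uint32_t ecx;	/* requested state: C0.1 or C0.2 */
	uint32_t edx;	/* upper 32 bits of the TSC deadline */
	uint32_t eax;	/* lower 32 bits of the TSC deadline */
};

/* Hypothetical helper: split a 64-bit TSC deadline into EDX:EAX. */
static struct tpause_args tpause_prepare(uint32_t state, uint64_t tsc_deadline)
{
	struct tpause_args a = {
		.ecx = state,
		.edx = (uint32_t)(tsc_deadline >> 32),
		.eax = (uint32_t)tsc_deadline,
	};

	return a;
}
```

A caller would then pass the fields on as `__tpause(a.ecx, a.edx, a.eax)`. As the comment above notes, a request for C0.2 may still be downgraded to C0.1 by the IA32_UMWAIT_CONTROL MSR.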
|
|
|
|
|
2010-09-17 15:36:40 -07:00
|
|
|
#endif /* _ASM_X86_MWAIT_H */
|