License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner, Kate Stewart, and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- the file had no licensing information in it,
- the file was a */uapi/* one with no licensing information in it,
- the file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier should be applied
to a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files, created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
should be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
The criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source.
- The file already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, the file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                               # files
----------------------------------------------------|--------
GPL-2.0                                                 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note"; otherwise it was "GPL-2.0". The results of that were:
SPDX license identifier                               # files
----------------------------------------------------|--------
GPL-2.0 WITH Linux-syscall-note                           930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file, or if it had no licensing
in it (per the prior point). Results summary:
SPDX license identifier                               # files
----------------------------------------------------|--------
GPL-2.0 WITH Linux-syscall-note                           270
GPL-2.0+ WITH Linux-syscall-note                          169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)        21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)        17
LGPL-2.1+ WITH Linux-syscall-note                          15
GPL-1.0+ WITH Linux-syscall-note                           14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)        5
LGPL-2.0+ WITH Linux-syscall-note                           4
LGPL-2.1 WITH Linux-syscall-note                            3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                  3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)                 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files, Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally, Philippe spent 10 hours doing a detailed manual inspection
and review of the 12,461 patched files from the initial patch version
earlier this week, with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types, as illustrated below). Finally, Greg ran the script using
the .csv files to generate the patches.
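For illustration only (these are the standard kernel tag forms per
Documentation/process/license-rules.rst; the exact lines below are an
example, not quoted from the patches): a .c source file gets a C99-style
comment on its first line, while a header keeps the classic block form:

// SPDX-License-Identifier: GPL-2.0        (first line of a .c file)
/* SPDX-License-Identifier: GPL-2.0 */     (first line of a .h file)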
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
// SPDX-License-Identifier: GPL-2.0
/*
 * x86 CPU caches detection and configuration
 *
 * Previous changes
 * - Venkatesh Pallipadi: Cache identification through CPUID(0x4)
 * - Ashok Raj <ashok.raj@intel.com>: Work with CPU hotplug infrastructure
 * - Andi Kleen / Andreas Herrmann: CPUID(0x4) emulation on AMD
 */

#include <linux/cacheinfo.h>
#include <linux/cpu.h>
#include <linux/cpuhotplug.h>
#include <linux/stop_machine.h>

#include <asm/amd/nb.h>
#include <asm/cacheinfo.h>
#include <asm/cpufeature.h>
#include <asm/cpuid/api.h>
#include <asm/mtrr.h>
#include <asm/smp.h>
#include <asm/tlbflush.h>

#include "cpu.h"

x86/cacheinfo: move shared cache map definitions
Patch series "cpumask: Fix invalid uniprocessor assumptions", v4.
On uniprocessor builds, it is currently assumed that any cpumask will
contain the single CPU: cpu0. This assumption is used to provide
optimised implementations.
The current assumption also appears to be wrong, by ignoring the fact that
users can provide empty cpumasks. This can result in bugs as explained in
[1] - for_each_cpu() will run one iteration of the loop even when passed
an empty cpumask (see the sketch after this message).
This series introduces some basic tests, and updates the optimisations for
uniprocessor builds.
The x86 patch was written after the kernel test robot [2] ran into a
failed build. I have tried to list the files potentially affected by the
changes to cpumask.h, in an attempt to find any other cases that fail on
!SMP. I've gone through some of the files manually, and ran a few cross
builds, but nothing else popped up. I (build) checked about half of the
potentially affected files, but I do not have the resources to do them
all. I hope we can fix other issues if/when they pop up later.
[1] https://lore.kernel.org/all/20220530082552.46113-1-sander@svanheule.net/
[2] https://lore.kernel.org/all/202206060858.wA0FOzRy-lkp@intel.com/
This patch (of 5):
The maps to keep track of shared caches between CPUs on SMP systems are
declared in asm/smp.h, among them specifically cpu_llc_shared_map. These
maps are externally defined in cpu/smpboot.c. The latter is only compiled
on CONFIG_SMP=y, which means the declared extern symbols from asm/smp.h do
not have a corresponding definition on uniprocessor builds.
The inline cpu_llc_shared_mask() function from asm/smp.h refers to the map
declaration mentioned above. This function is referenced in cacheinfo.c
inside for_each_cpu() loop macros, to provide the cpumask for the loop. On
uniprocessor builds, the symbol for the cpu_llc_shared_map does not exist.
However, the current implementation of for_each_cpu() also (wrongly)
ignores the provided mask.
By sheer luck, the compiler thus optimises out this unused reference to
cpu_llc_shared_map, and the linker therefore does not require the
cpu_llc_shared_mask to actually exist on uniprocessor builds. Only on SMP
builds does smpboot.o exist to provide the required symbols.
To no longer rely on compiler optimisations for successful uniprocessor
builds, move the definitions of cpu_llc_shared_map and cpu_l2c_shared_map
from smpboot.c to cacheinfo.c.
Link: https://lkml.kernel.org/r/cover.1656777646.git.sander@svanheule.net
Link: https://lkml.kernel.org/r/e8167ddb570f56744a3dc12c2149a660a324d969.1656777646.git.sander@svanheule.net
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Marco Elver <elver@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
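For reference, a minimal sketch of the uniprocessor for_each_cpu()
definition this series fixes (paraphrased from older <linux/cpumask.h>;
the exact historical wording may differ): the mask argument is evaluated
only for its side effects, so the body runs exactly once even when the
mask is empty, and any symbol referenced solely through the mask can be
optimised away.

/* Pre-fix !SMP definition (paraphrased) - the mask is never consulted */
#define for_each_cpu(cpu, mask)                                 \
        for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)(mask))
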
/* Shared last level cache maps */
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);

/* Shared L2 cache maps */
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_l2c_shared_map);

static cpumask_var_t cpu_cacheinfo_mask;

/* Kernel controls MTRR and/or PAT MSRs. */
unsigned int memory_caching_control __ro_after_init;

enum _cache_type {
        CTYPE_NULL = 0,
        CTYPE_DATA = 1,
        CTYPE_INST = 2,
        CTYPE_UNIFIED = 3
};

union _cpuid4_leaf_eax {
        struct {
                enum _cache_type        type                    :5;
                unsigned int            level                   :3;
                unsigned int            is_self_initializing    :1;
                unsigned int            is_fully_associative    :1;
                unsigned int            reserved                :4;
                unsigned int            num_threads_sharing     :12;
                unsigned int            num_cores_on_die        :6;
        } split;
        u32 full;
};

union _cpuid4_leaf_ebx {
        struct {
                unsigned int            coherency_line_size     :12;
                unsigned int            physical_line_partition :10;
                unsigned int            ways_of_associativity   :10;
        } split;
        u32 full;
};

union _cpuid4_leaf_ecx {
        struct {
                unsigned int            number_of_sets          :32;
        } split;
        u32 full;
};

struct _cpuid4_info {
        union _cpuid4_leaf_eax eax;
        union _cpuid4_leaf_ebx ebx;
        union _cpuid4_leaf_ecx ecx;
        unsigned int id;
        unsigned long size;
};

/* Map CPUID(0x4) EAX.cache_type to <linux/cacheinfo.h> types */
static const enum cache_type cache_type_map[] = {
        [CTYPE_NULL]    = CACHE_TYPE_NOCACHE,
        [CTYPE_DATA]    = CACHE_TYPE_DATA,
        [CTYPE_INST]    = CACHE_TYPE_INST,
        [CTYPE_UNIFIED] = CACHE_TYPE_UNIFIED,
};

/*
 * Fallback AMD CPUID(0x4) emulation
 * AMD CPUs with TOPOEXT can just use CPUID(0x8000001d)
 *
 * @AMD_L2_L3_INVALID_ASSOC: cache info for the respective L2/L3 cache should
 * be determined from CPUID(0x8000001d) instead of CPUID(0x80000006).
 */

x86/cacheinfo: Properly parse CPUID(0x80000005) L1d/L1i associativity
For the AMD CPUID(4) emulation cache info logic, the same associativity
mapping array, assocs[], is used for both CPUID(0x80000005) and
CPUID(0x80000006).
This is incorrect since, per the AMD manuals, the mappings for
CPUID(0x80000005) L1d/L1i associativity are:
        n = 0x1 -> 0xfe         n
        n = 0xff                fully associative
while assocs[] maps these values to:
        n = 0x1, 0x2, 0x4       n
        n = 0x3, 0x7, 0x9       0
        n = 0x6                 8
        n = 0x8                 16
        n = 0xa                 32
        n = 0xb                 48
        n = 0xc                 64
        n = 0xd                 96
        n = 0xe                 128
        n = 0xf                 fully associative
which is only valid for CPUID(0x80000006). For example, a raw value of
0x6 means 6-way associative per CPUID(0x80000005), but assocs[0x6] maps
it to 8.
Parse CPUID(0x80000005) L1d/L1i associativity values as shown in the AMD
manuals. Since the 0xffff literal is used to denote full associativity
at the AMD CPUID(4)-emulation logic, define AMD_CPUID4_FULLY_ASSOCIATIVE
for it instead of spreading that literal in more places.
Mark the assocs[] mapping array as only valid for CPUID(0x80000006) L2/L3
cache information.
Fixes: a326e948c538 ("x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors")
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: x86-cpuid@lists.linux.dev
Link: https://lore.kernel.org/r/20250409122233.1058601-2-darwi@linutronix.de
#define AMD_CPUID4_FULLY_ASSOCIATIVE    0xffff

x86/cacheinfo: Properly parse CPUID(0x80000006) L2/L3 associativity
Complete the AMD CPUID(4) emulation logic, which uses CPUID(0x80000006)
for L2/L3 cache info and an assocs[] associativity mapping array, by
adding entries for 3-way caches and 6-way caches.
Properly handle the case where CPUID(0x80000006) returns an L2/L3
associativity of 9. This is not real associativity, but a marker to
indicate that the respective L2/L3 cache information should be retrieved
from CPUID(0x8000001d) instead. If such a marker is encountered, return
early from legacy_amd_cpuid4(), thus effectively emulating an "invalid
index" CPUID(4) response with a cache type of zero.
When checking if CPUID(0x80000006) L2/L3 cache info output is valid, and
given the associativity marker 9 above, do not just check if the whole
ECX/EDX register is zero. Rather, check if the associativity is zero or
9. An associativity of zero implies no L2/L3 cache, which makes it the
more correct check anyway vs. a zero check of the whole output register.
Fixes: a326e948c538 ("x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors")
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: x86-cpuid@lists.linux.dev
Link: https://lore.kernel.org/r/20250409122233.1058601-3-darwi@linutronix.de

#define AMD_L2_L3_INVALID_ASSOC         0x9

union l1_cache {
        struct {
                unsigned line_size      :8;
                unsigned lines_per_tag  :8;
                unsigned assoc          :8;
                unsigned size_in_kb     :8;
        };
        unsigned int val;
};

union l2_cache {
        struct {
                unsigned line_size      :8;
                unsigned lines_per_tag  :4;
                unsigned assoc          :4;
                unsigned size_in_kb     :16;
        };
        unsigned int val;
};

union l3_cache {
        struct {
                unsigned line_size      :8;
                unsigned lines_per_tag  :4;
                unsigned assoc          :4;
                unsigned res            :2;
                unsigned size_encoded   :14;
        };
        unsigned int val;
};

/* L2/L3 associativity mapping */
x86: delete __cpuinit usage from all x86 files
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. The fix in commit
5e427ec2d0 ("x86: Fix bit corruption at CPU resume time") is a good
example of the nasty type of bugs that can be created with improper
use of the various __init prefixes.
After a discussion on LKML [1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.
Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
and are flagged as __cpuinit -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings. In any case, they are temporary and harmless.
This removes all the arch/x86 uses of the __cpuinit macros from
all C files. x86 only had the one __CPUINIT used in assembly files,
and it wasn't paired off with a .previous or a __FINIT, so we can
delete it directly w/o any corresponding additional change there.
[1] https://lkml.org/lkml/2013/5/20/589
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

static const unsigned short assocs[] = {
        [1]   = 1,
        [2]   = 2,
        [3]   = 3,
        [4]   = 4,
        [5]   = 6,
        [6]   = 8,
        [8]   = 16,
        [0xa] = 32,
        [0xb] = 48,
        [0xc] = 64,
        [0xd] = 96,
        [0xe] = 128,
        [0xf] = AMD_CPUID4_FULLY_ASSOCIATIVE
};

static const unsigned char levels[] = { 1, 1, 2, 3 };
static const unsigned char types[]  = { 1, 2, 3, 3 };

static void legacy_amd_cpuid4(int index, union _cpuid4_leaf_eax *eax,
                              union _cpuid4_leaf_ebx *ebx, union _cpuid4_leaf_ecx *ecx)
{
        unsigned int dummy, line_size, lines_per_tag, assoc, size_in_kb;
        union l1_cache l1i, l1d, *l1;
        union l2_cache l2;
        union l3_cache l3;

        eax->full = 0;
        ebx->full = 0;
        ecx->full = 0;

        cpuid(0x80000005, &dummy, &dummy, &l1d.val, &l1i.val);
        cpuid(0x80000006, &dummy, &dummy, &l2.val, &l3.val);

        l1 = &l1d;
        switch (index) {
        case 1:
                l1 = &l1i;
                fallthrough;
        case 0:
                if (!l1->val)
                        return;

                assoc           = (l1->assoc == 0xff) ? AMD_CPUID4_FULLY_ASSOCIATIVE : l1->assoc;
                line_size       = l1->line_size;
                lines_per_tag   = l1->lines_per_tag;
                size_in_kb      = l1->size_in_kb;
                break;
        case 2:
                if (!l2.assoc || l2.assoc == AMD_L2_L3_INVALID_ASSOC)
                        return;

                /* Use x86_cache_size as it might have K7 errata fixes */
                assoc           = assocs[l2.assoc];
                line_size       = l2.line_size;
                lines_per_tag   = l2.lines_per_tag;
                size_in_kb      = __this_cpu_read(cpu_info.x86_cache_size);
                break;
        case 3:
                if (!l3.assoc || l3.assoc == AMD_L2_L3_INVALID_ASSOC)
                        return;

                assoc           = assocs[l3.assoc];
                line_size       = l3.line_size;
                lines_per_tag   = l3.lines_per_tag;
                size_in_kb      = l3.size_encoded * 512;
                if (boot_cpu_has(X86_FEATURE_AMD_DCM)) {
                        size_in_kb = size_in_kb >> 1;
                        assoc = assoc >> 1;
                }
                break;
        default:
                return;
        }

        eax->split.is_self_initializing = 1;
        eax->split.type                 = types[index];
        eax->split.level                = levels[index];
        eax->split.num_threads_sharing  = 0;
        eax->split.num_cores_on_die     = topology_num_cores_per_package();

        if (assoc == AMD_CPUID4_FULLY_ASSOCIATIVE)
                eax->split.is_fully_associative = 1;

        ebx->split.coherency_line_size          = line_size - 1;
        ebx->split.ways_of_associativity        = assoc - 1;
        ebx->split.physical_line_partition      = lines_per_tag - 1;
        ecx->split.number_of_sets               = (size_in_kb * 1024) / line_size /
                (ebx->split.ways_of_associativity + 1) - 1;
}
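
A worked example of the field encoding above (my numbers, not from the
source): for a hypothetical 512 KB, 8-way L2 with 64-byte lines,
legacy_amd_cpuid4() would store:

/*
 * Illustrative values only:
 *   assoc = 8, line_size = 64, size_in_kb = 512
 *   ebx->split.ways_of_associativity = 8 - 1 = 7
 *   ecx->split.number_of_sets = (512 * 1024) / 64 / (7 + 1) - 1 = 1023
 * i.e. the CPUID(0x4)-style fields are all stored as "value - 1".
 */
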
static int cpuid4_info_fill_done(struct _cpuid4_info *id4, union _cpuid4_leaf_eax eax,
                                 union _cpuid4_leaf_ebx ebx, union _cpuid4_leaf_ecx ecx)
{
        if (eax.split.type == CTYPE_NULL)
                return -EIO;

        id4->eax = eax;
        id4->ebx = ebx;
        id4->ecx = ecx;
        id4->size = (ecx.split.number_of_sets + 1) *
                    (ebx.split.coherency_line_size + 1) *
                    (ebx.split.physical_line_partition + 1) *
                    (ebx.split.ways_of_associativity + 1);

        return 0;
}

static int amd_fill_cpuid4_info(int index, struct _cpuid4_info *id4)
{
        union _cpuid4_leaf_eax eax;
        union _cpuid4_leaf_ebx ebx;
        union _cpuid4_leaf_ecx ecx;
        u32 ignored;

        if (boot_cpu_has(X86_FEATURE_TOPOEXT) || boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
                cpuid_count(0x8000001d, index, &eax.full, &ebx.full, &ecx.full, &ignored);
        else
                legacy_amd_cpuid4(index, &eax, &ebx, &ecx);

        return cpuid4_info_fill_done(id4, eax, ebx, ecx);
}

static int intel_fill_cpuid4_info(int index, struct _cpuid4_info *id4)
{
        union _cpuid4_leaf_eax eax;
        union _cpuid4_leaf_ebx ebx;
        union _cpuid4_leaf_ecx ecx;
        u32 ignored;

        cpuid_count(4, index, &eax.full, &ebx.full, &ecx.full, &ignored);

        return cpuid4_info_fill_done(id4, eax, ebx, ecx);
}

static int fill_cpuid4_info(int index, struct _cpuid4_info *id4)
{
        u8 cpu_vendor = boot_cpu_data.x86_vendor;

        return (cpu_vendor == X86_VENDOR_AMD || cpu_vendor == X86_VENDOR_HYGON) ?
                amd_fill_cpuid4_info(index, id4) :
                intel_fill_cpuid4_info(index, id4);
}

static int find_num_cache_leaves(struct cpuinfo_x86 *c)
{
        unsigned int eax, ebx, ecx, edx, op;
        union _cpuid4_leaf_eax cache_eax;
        int i = -1;

        /* Do a CPUID(op) loop to calculate num_cache_leaves */
        op = (c->x86_vendor == X86_VENDOR_AMD || c->x86_vendor == X86_VENDOR_HYGON) ? 0x8000001d : 4;
        do {
                ++i;
                cpuid_count(op, i, &eax, &ebx, &ecx, &edx);
                cache_eax.full = eax;
        } while (cache_eax.split.type != CTYPE_NULL);
        return i;
}

/*
 * AMD/Hygon CPUs may have multiple LLCs if L3 caches exist.
 */

void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, u16 die_id)
{
        if (!cpuid_amd_hygon_has_l3_cache())
                return;

        if (c->x86 < 0x17) {
                /* Pre-Zen: LLC is at the node level */
                c->topo.llc_id = die_id;
        } else if (c->x86 == 0x17 && c->x86_model <= 0x1F) {
                /*
                 * Family 17h up to 1F models: LLC is at the core
                 * complex level. Core complex ID is ApicId[3].
                 */
                c->topo.llc_id = c->topo.apicid >> 3;
        } else {
                /*
                 * Newer families: LLC ID is calculated from the number
                 * of threads sharing the L3 cache.
                 */
                u32 eax, ebx, ecx, edx, num_sharing_cache = 0;
                u32 llc_index = find_num_cache_leaves(c) - 1;

                cpuid_count(0x8000001d, llc_index, &eax, &ebx, &ecx, &edx);
                if (eax)
                        num_sharing_cache = ((eax >> 14) & 0xfff) + 1;

                if (num_sharing_cache) {
                        int index_msb = get_count_order(num_sharing_cache);

                        c->topo.llc_id = c->topo.apicid >> index_msb;
                }
        }
}

void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c)
{
        if (!cpuid_amd_hygon_has_l3_cache())
                return;

        /*
         * Hygons are similar to AMD Family 17h up to 1F models: LLC is
         * at the core complex level. Core complex ID is ApicId[3].
         */
        c->topo.llc_id = c->topo.apicid >> 3;
}

void init_amd_cacheinfo(struct cpuinfo_x86 *c)
{
        struct cpu_cacheinfo *ci = get_cpu_cacheinfo(c->cpu_index);

        if (boot_cpu_has(X86_FEATURE_TOPOEXT))
                ci->num_leaves = find_num_cache_leaves(c);
        else if (c->extended_cpuid_level >= 0x80000006)
                ci->num_leaves = (cpuid_edx(0x80000006) & 0xf000) ? 4 : 3;
}

void init_hygon_cacheinfo(struct cpuinfo_x86 *c)
{
        struct cpu_cacheinfo *ci = get_cpu_cacheinfo(c->cpu_index);

        ci->num_leaves = find_num_cache_leaves(c);
}
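
One note on the fallback branch in init_amd_cacheinfo() above: per the
AMD manuals, bits 15:12 of CPUID(0x80000006) EDX hold the L3
associativity, so a non-zero field implies an L3 cache exists and hence
a fourth cache leaf (L1d, L1i, L2, L3) instead of three.
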
static void intel_cacheinfo_done(struct cpuinfo_x86 *c, unsigned int l3,
                                 unsigned int l2, unsigned int l1i, unsigned int l1d)
{
        /*
         * If llc_id is still unset, then cpuid_level < 4, which implies
         * that the only possibility left is SMT. Since CPUID(0x2) doesn't
         * specify any shared caches and SMT shares all caches, we can
         * unconditionally set LLC ID to the package ID so that all
         * threads share it.
         */
        if (c->topo.llc_id == BAD_APICID)
                c->topo.llc_id = c->topo.pkg_id;

        c->x86_cache_size = l3 ? l3 : (l2 ? l2 : l1i + l1d);

        if (!l2)
                cpu_detect_cache_sizes(c);
}

/*
 * Legacy Intel CPUID(0x2) path if CPUID(0x4) is not available.
 */
static void intel_cacheinfo_0x2(struct cpuinfo_x86 *c)
{
        unsigned int l1i = 0, l1d = 0, l2 = 0, l3 = 0;
        const struct leaf_0x2_table *desc;
        union leaf_0x2_regs regs;
        u8 *ptr;

        if (c->cpuid_level < 2)
                return;

        cpuid_leaf_0x2(&regs);
        for_each_cpuid_0x2_desc(regs, ptr, desc) {
                switch (desc->c_type) {
                case CACHE_L1_INST:     l1i += desc->c_size; break;
                case CACHE_L1_DATA:     l1d += desc->c_size; break;
                case CACHE_L2:          l2  += desc->c_size; break;
                case CACHE_L3:          l3  += desc->c_size; break;
                }
        }

        intel_cacheinfo_done(c, l3, l2, l1i, l1d);
}

static unsigned int calc_cache_topo_id(struct cpuinfo_x86 *c, const struct _cpuid4_info *id4)
{
        unsigned int num_threads_sharing;
        int index_msb;

        num_threads_sharing = 1 + id4->eax.split.num_threads_sharing;
        index_msb = get_count_order(num_threads_sharing);
        return c->topo.apicid & ~((1 << index_msb) - 1);
}
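
A quick example of the masking in calc_cache_topo_id() (illustrative
numbers): num_threads_sharing = 4 gives index_msb = get_count_order(4)
= 2, so the returned topology ID is the APIC ID with its low two bits
cleared; APIC IDs 4..7 all share ID 4.
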
static bool intel_cacheinfo_0x4(struct cpuinfo_x86 *c)
{
        struct cpu_cacheinfo *ci = get_cpu_cacheinfo(c->cpu_index);
        unsigned int l2_id = BAD_APICID, l3_id = BAD_APICID;
        unsigned int l1d = 0, l1i = 0, l2 = 0, l3 = 0;

        if (c->cpuid_level < 4)
                return false;

        /*
         * There should be at least one leaf. A non-zero value means
         * that the number of leaves has been previously initialized.
         */
        if (!ci->num_leaves)
                ci->num_leaves = find_num_cache_leaves(c);

        if (!ci->num_leaves)
                return false;

        for (int i = 0; i < ci->num_leaves; i++) {
                struct _cpuid4_info id4 = {};
                int ret;

                ret = intel_fill_cpuid4_info(i, &id4);
                if (ret < 0)
                        continue;

                switch (id4.eax.split.level) {
                case 1:
                        if (id4.eax.split.type == CTYPE_DATA)
                                l1d = id4.size / 1024;
                        else if (id4.eax.split.type == CTYPE_INST)
                                l1i = id4.size / 1024;
                        break;
                case 2:
                        l2 = id4.size / 1024;
                        l2_id = calc_cache_topo_id(c, &id4);
                        break;
                case 3:
                        l3 = id4.size / 1024;
                        l3_id = calc_cache_topo_id(c, &id4);
                        break;
                default:
                        break;
                }
        }

        c->topo.l2c_id = l2_id;
        c->topo.llc_id = (l3_id == BAD_APICID) ? l2_id : l3_id;
        intel_cacheinfo_done(c, l3, l2, l1i, l1d);
        return true;
}

void init_intel_cacheinfo(struct cpuinfo_x86 *c)
{
        /* Don't use CPUID(0x2) if CPUID(0x4) is supported. */
        if (intel_cacheinfo_0x4(c))
                return;

        intel_cacheinfo_0x2(c);
}

/*
|
2025-04-11 09:04:01 +02:00
|
|
|
* <linux/cacheinfo.h> shared_cpu_map setup, AMD/Hygon
|
2025-03-24 14:33:24 +01:00
|
|
|
*/
|
2015-03-04 12:00:16 +00:00
|
|
|
static int __cache_amd_cpumap_setup(unsigned int cpu, int index,
|
2025-03-24 14:33:11 +01:00
|
|
|
const struct _cpuid4_info *id4)
|
2005-04-16 15:20:36 -07:00
|
|
|
{
|
2021-03-31 16:00:24 +08:00
|
|
|
struct cpu_cacheinfo *this_cpu_ci;
|
2025-03-24 14:33:02 +01:00
|
|
|
struct cacheinfo *ci;
|
2012-10-19 11:02:09 +02:00
|
|
|
int i, sibling;
|
2005-04-16 15:20:36 -07:00
|
|
|
|
2017-07-31 10:51:59 +02:00
|
|
|
/*
|
|
|
|
* For L3, always use the pre-calculated cpu_llc_shared_mask
|
|
|
|
* to derive shared_cpu_map.
|
|
|
|
*/
|
|
|
|
if (index == 3) {
|
|
|
|
for_each_cpu(i, cpu_llc_shared_mask(cpu)) {
|
|
|
|
this_cpu_ci = get_cpu_cacheinfo(i);
|
|
|
|
if (!this_cpu_ci->info_list)
|
|
|
|
continue;
|
2025-03-24 14:33:24 +01:00
|
|
|
|
2025-03-24 14:33:02 +01:00
|
|
|
ci = this_cpu_ci->info_list + index;
|
2017-07-31 10:51:59 +02:00
|
|
|
for_each_cpu(sibling, cpu_llc_shared_mask(cpu)) {
|
|
|
|
if (!cpu_online(sibling))
|
|
|
|
continue;
|
2025-03-24 14:33:24 +01:00
|
|
|
cpumask_set_cpu(sibling, &ci->shared_cpu_map);
|
2017-07-31 10:51:59 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
} else if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
|
2012-10-19 11:02:09 +02:00
|
|
|
unsigned int apicid, nshared, first, last;
|
|
|
|
|
2025-03-24 14:33:05 +01:00
|
|
|
nshared = id4->eax.split.num_threads_sharing + 1;
|
2023-08-14 10:18:29 +02:00
|
|
|
apicid = cpu_data(cpu).topo.apicid;
|
2012-10-19 11:02:09 +02:00
|
|
|
first = apicid - (apicid % nshared);
|
|
|
|
last = first + nshared - 1;
|
|
|
|
|
|
|
|
for_each_online_cpu(i) {
|
2015-03-04 12:00:16 +00:00
|
|
|
this_cpu_ci = get_cpu_cacheinfo(i);
|
|
|
|
if (!this_cpu_ci->info_list)
|
|
|
|
continue;
|
|
|
|
|
2023-08-14 10:18:29 +02:00
|
|
|
apicid = cpu_data(i).topo.apicid;
|
2012-10-19 11:02:09 +02:00
|
|
|
if ((apicid < first) || (apicid > last))
|
|
|
|
continue;
|
2015-03-04 12:00:16 +00:00
|
|
|
|
2025-03-24 14:33:02 +01:00
|
|
|
ci = this_cpu_ci->info_list + index;
|
2012-10-19 11:02:09 +02:00
|
|
|
|
|
|
|
for_each_online_cpu(sibling) {
|
2023-08-14 10:18:29 +02:00
|
|
|
apicid = cpu_data(sibling).topo.apicid;
|
2012-10-19 11:02:09 +02:00
|
|
|
if ((apicid < first) || (apicid > last))
|
2009-12-09 13:36:45 -05:00
|
|
|
continue;
|
2025-03-24 14:33:24 +01:00
|
|
|
cpumask_set_cpu(sibling, &ci->shared_cpu_map);
|
2009-12-09 13:36:45 -05:00
|
|
|
}
|
2009-09-03 09:41:19 +02:00
|
|
|
}
|
2012-10-19 11:02:09 +02:00
|
|
|
} else
|
|
|
|
return 0;
|
2012-02-08 20:52:29 +01:00
|
|
|
|
2012-10-19 11:02:09 +02:00
|
|
|
return 1;
|
2012-02-08 20:52:29 +01:00
|
|
|
}
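
/*
 * Worked example for the TOPOEXT branch above: an L2 shared by two
 * threads reports num_threads_sharing = 1, so nshared = 2. A CPU with
 * APIC ID 5 then gets first = 5 - (5 % 2) = 4 and last = 5, i.e. the
 * sharing window covers the two SMT siblings with APIC IDs 4 and 5.
 */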

/*
 * <linux/cacheinfo.h> shared_cpu_map setup, Intel + fallback AMD/Hygon
 */
static void __cache_cpumap_setup(unsigned int cpu, int index,
				 const struct _cpuid4_info *id4)
{
	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
	struct cpuinfo_x86 *c = &cpu_data(cpu);
	struct cacheinfo *ci, *sibling_ci;
	unsigned long num_threads_sharing;
	int index_msb, i;

	if (c->x86_vendor == X86_VENDOR_AMD || c->x86_vendor == X86_VENDOR_HYGON) {
		if (__cache_amd_cpumap_setup(cpu, index, id4))
			return;
	}

	ci = this_cpu_ci->info_list + index;
	num_threads_sharing = 1 + id4->eax.split.num_threads_sharing;

	cpumask_set_cpu(cpu, &ci->shared_cpu_map);
	if (num_threads_sharing == 1)
		return;

	index_msb = get_count_order(num_threads_sharing);

	for_each_online_cpu(i)
		if (cpu_data(i).topo.apicid >> index_msb == c->topo.apicid >> index_msb) {
			struct cpu_cacheinfo *sib_cpu_ci = get_cpu_cacheinfo(i);

			/* Skip if itself or no cacheinfo */
			if (i == cpu || !sib_cpu_ci->info_list)
				continue;

			sibling_ci = sib_cpu_ci->info_list + index;
			cpumask_set_cpu(i, &ci->shared_cpu_map);
			cpumask_set_cpu(cpu, &sibling_ci->shared_cpu_map);
		}
}
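
/*
 * Worked example for the generic path above: num_threads_sharing = 4
 * gives index_msb = get_count_order(4) = 2, so any online CPU whose
 * APIC ID matches this CPU's in all but the low two bits is marked as
 * sharing the cache, in both directions.
 */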

static void ci_info_init(struct cacheinfo *ci, const struct _cpuid4_info *id4,
			 struct amd_northbridge *nb)
{
	ci->id = id4->id;
	ci->attributes = CACHE_ID;
	ci->level = id4->eax.split.level;
	ci->type = cache_type_map[id4->eax.split.type];
	ci->coherency_line_size = id4->ebx.split.coherency_line_size + 1;
	ci->ways_of_associativity = id4->ebx.split.ways_of_associativity + 1;
	ci->size = id4->size;
	ci->number_of_sets = id4->ecx.split.number_of_sets + 1;
	ci->physical_line_partition = id4->ebx.split.physical_line_partition + 1;
	ci->priv = nb;
}
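
/*
 * The "+ 1" decodes above mirror the CPUID(0x4) encoding, where these
 * fields are stored as value-minus-one: e.g. a raw ways_of_associativity
 * of 7 in EBX means an 8-way cache. The leaf's total size is
 * ways * partitions * line_size * sets, which is presumably what the
 * fill helper precomputed into id4->size.
 */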

int init_cache_level(unsigned int cpu)
{
	struct cpu_cacheinfo *ci = get_cpu_cacheinfo(cpu);

	/* There should be at least one leaf. */
	if (!ci->num_leaves)
		return -ENOENT;

	return 0;
}
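
/*
 * init_cache_level() above and populate_cache_leaves() below are the
 * arch hooks invoked by the generic cacheinfo core
 * (drivers/base/cacheinfo.c) when it builds the sysfs cache hierarchy
 * for a CPU.
 */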

/*
 * The maximum number of shared threads comes from CPUID(0x4) EAX[25:14],
 * with the cache index as the ECX input. Right-shift the APIC ID by that
 * count's order to get the cache id for this cache node.
 */
static void get_cache_id(int cpu, struct _cpuid4_info *id4)
{
	struct cpuinfo_x86 *c = &cpu_data(cpu);
	unsigned long num_threads_sharing;
	int index_msb;

	num_threads_sharing = 1 + id4->eax.split.num_threads_sharing;
	index_msb = get_count_order(num_threads_sharing);
	id4->id = c->topo.apicid >> index_msb;
}
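
/*
 * Example: a leaf shared by two threads carries a raw field value of 1,
 * so num_threads_sharing = 2 and index_msb = get_count_order(2) = 1; two
 * SMT siblings with APIC IDs 2n and 2n + 1 then collapse to the same
 * cache id n.
 */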

int populate_cache_leaves(unsigned int cpu)
{
	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
	struct cacheinfo *ci = this_cpu_ci->info_list;
	u8 cpu_vendor = boot_cpu_data.x86_vendor;
	struct amd_northbridge *nb = NULL;
	struct _cpuid4_info id4 = {};
	int idx, ret;

	for (idx = 0; idx < this_cpu_ci->num_leaves; idx++) {
		ret = fill_cpuid4_info(idx, &id4);
		if (ret)
			return ret;

		get_cache_id(cpu, &id4);

		if (cpu_vendor == X86_VENDOR_AMD || cpu_vendor == X86_VENDOR_HYGON)
			nb = amd_init_l3_cache(idx);

		ci_info_init(ci++, &id4, nb);
		__cache_cpumap_setup(cpu, idx, &id4);
	}

	this_cpu_ci->cpu_map_populated = true;
	return 0;
}
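
/*
 * Setting cpu_map_populated above tells the generic cacheinfo code that
 * shared_cpu_map has already been filled in here, so it skips its own
 * shared-CPU detection for this CPU.
 */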

/*
 * Disable and enable caches. Needed for changing MTRRs and the PAT MSR.
 *
 * Since the caches are being disabled, don't allow any interrupts: they
 * would run extremely slowly and would only increase the pain.
 *
 * The caller must ensure that local interrupts are disabled and
 * are reenabled after cache_enable() has been called.
 */
static unsigned long saved_cr4;
static DEFINE_RAW_SPINLOCK(cache_disable_lock);

/*
 * Cache flushing is the most time-consuming step when programming the
 * MTRRs. On many Intel CPUs without known errata, it can be skipped
 * if the CPU declares cache self-snooping support.
 */
static void maybe_flush_caches(void)
{
	if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
		wbinvd();
}

void cache_disable(void) __acquires(cache_disable_lock)
{
	unsigned long cr0;

	/*
	 * This is not ideal since the cache is only flushed/disabled
	 * for this CPU while the MTRRs are changed, but changing this
	 * requires more invasive changes to the way the kernel boots.
	 */
	raw_spin_lock(&cache_disable_lock);

	/* Enter the no-fill (CD=1, NW=0) cache mode and flush caches. */
	cr0 = read_cr0() | X86_CR0_CD;
	write_cr0(cr0);

	maybe_flush_caches();

	/* Save value of CR4 and clear Page Global Enable (bit 7) */
	if (cpu_feature_enabled(X86_FEATURE_PGE)) {
		saved_cr4 = __read_cr4();
		__write_cr4(saved_cr4 & ~X86_CR4_PGE);
	}

	/* Flush all TLBs via a mov %cr3, %reg; mov %reg, %cr3 */
	count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
	flush_tlb_local();

	if (cpu_feature_enabled(X86_FEATURE_MTRR))
		mtrr_disable();

	maybe_flush_caches();
}

void cache_enable(void) __releases(cache_disable_lock)
{
	/* Flush TLBs (no need to flush caches - they are disabled) */
	count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
	flush_tlb_local();

	if (cpu_feature_enabled(X86_FEATURE_MTRR))
		mtrr_enable();

	/* Enable caches */
	write_cr0(read_cr0() & ~X86_CR0_CD);

	/* Restore value of CR4 */
	if (cpu_feature_enabled(X86_FEATURE_PGE))
		__write_cr4(saved_cr4);

	raw_spin_unlock(&cache_disable_lock);
}
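
/*
 * Typical calling pattern (a sketch, not an additional API): the caller
 * disables local interrupts around the critical section, as required by
 * the comment above cache_disable():
 *
 *	local_irq_save(flags);
 *	cache_disable();
 *	... reprogram MTRRs / PAT ...
 *	cache_enable();
 *	local_irq_restore(flags);
 *
 * cache_cpu_init() below follows exactly this pattern.
 */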

static void cache_cpu_init(void)
{
	unsigned long flags;

	local_irq_save(flags);

	if (memory_caching_control & CACHE_MTRR) {
		cache_disable();
		mtrr_generic_set_state();
		cache_enable();
	}

	if (memory_caching_control & CACHE_PAT)
		pat_cpu_init();

	local_irq_restore(flags);
}

static bool cache_aps_delayed_init = true;

void set_cache_aps_delayed_init(bool val)
{
	cache_aps_delayed_init = val;
}

bool get_cache_aps_delayed_init(void)
{
	return cache_aps_delayed_init;
}
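
/*
 * Rationale note: MTRR and PAT state must be identical on every CPU, so
 * during boot the APs defer their cache programming until
 * cache_aps_init() can rendezvous all of them at once; afterwards each
 * hotplugged CPU initializes itself directly via cache_ap_online().
 */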

static int cache_rendezvous_handler(void *unused)
{
	if (get_cache_aps_delayed_init() || !cpu_online(smp_processor_id()))
		cache_cpu_init();

	return 0;
}
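
/*
 * The !cpu_online() case above appears to cover callers such as
 * stop_machine_from_inactive_cpu() during software resume, where the
 * initiating CPU participates in the rendezvous before being marked
 * online.
 */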

void __init cache_bp_init(void)
{
	mtrr_bp_init();
	pat_bp_init();

	if (memory_caching_control)
		cache_cpu_init();
}

void cache_bp_restore(void)
{
	if (memory_caching_control)
		cache_cpu_init();
}

static int cache_ap_online(unsigned int cpu)
{
	cpumask_set_cpu(cpu, cpu_cacheinfo_mask);

	if (!memory_caching_control || get_cache_aps_delayed_init())
		return 0;

	/*
	 * Ideally we should hold mtrr_mutex here to prevent MTRR entries
	 * from being changed, but this routine is called during CPU boot,
	 * where holding the lock would break things.
	 *
	 * This routine is called in two cases:
	 *
	 * 1. very early in software resume, when there absolutely
	 *    are no MTRR entry changes;
	 *
	 * 2. CPU hotadd time. We let mtrr_add/del_page hold the cpuhotplug
	 *    lock to prevent MTRR entry changes.
	 */
	stop_machine_from_inactive_cpu(cache_rendezvous_handler, NULL,
				       cpu_cacheinfo_mask);

	return 0;
}

static int cache_ap_offline(unsigned int cpu)
{
	cpumask_clear_cpu(cpu, cpu_cacheinfo_mask);
	return 0;
}

/*
 * Delayed cache initialization for all APs
 */
void cache_aps_init(void)
{
	if (!memory_caching_control || !get_cache_aps_delayed_init())
		return;

	stop_machine(cache_rendezvous_handler, NULL, cpu_online_mask);
	set_cache_aps_delayed_init(false);
}
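
/*
 * Once cache_aps_init() has cleared the delayed-init flag, CPUs onlined
 * later no longer take the early return in cache_ap_online() and instead
 * run the rendezvous handler themselves.
 */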

static int __init cache_ap_register(void)
{
	zalloc_cpumask_var(&cpu_cacheinfo_mask, GFP_KERNEL);
	cpumask_set_cpu(smp_processor_id(), cpu_cacheinfo_mask);

	cpuhp_setup_state_nocalls(CPUHP_AP_CACHECTRL_STARTING,
				  "x86/cachectrl:starting",
				  cache_ap_online, cache_ap_offline);
	return 0;
}
early_initcall(cache_ap_register);