Mitigate VMSCAPE issue with indirect branch predictor flushes

-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmi58uwACgkQaDWVMHDJ
 krCIBxAAj/8/RBSSK6ULtDLKbmpRKMVpwEE1Yt8vK95Z/50gVSidtQtofIet+CPY
 NeN5Y4Aip3w/JFoIQafop8ZASOFjNjhqVEjE75RdtdDacQCyluqWg/2PrJpKkBVv
 OWTVVVPD9aSZAY0Tk/79ABV8Fbp/EBID5mhJ40GrBhkLZku2ALDj1eQINEjoBedB
 2+sCO1MMqynlmglt8FltwFtl0rHgtlhGviuc/QmsxH9FrLIGBlgciW4Rma+LOtAE
 4iD1Ij/ICuwA78kPAgrxvs+B1w3QGZhTPvOHjj0c9kKM3jBqphWoMWFUKbFfUK8i
 6rM0jZMB8iaUcKJ+Ra+stNmvddLkbya7J9wwHgQWi/kxEMZMxbbbOXwfl1Ya8sha
 n/kKxm8Lsrjex3RTnd1hoXvGY2blr0dZ97jfjgOqVuYBZih5yWzixQbuf3TAbCZO
 Kb+fbfC7EsI1N0zuFh42Q1hT0zxYYshNIxtGPjDwspJRkHvhmNjNswXr7sccXhFo
 P5araDcYN0ul85SlAhQRMB17mle47ETSgh04LRM4Rq3rbweXzghoRj//WcY4YqYS
 qSJEFzSC7hVwNabG+NBexUaZL8bZRMoE7qx5lmo0q+tTMIQkEG2rqrFz9b1d4JON
 g6aKyrD8YyRCoBjZAF0tjCwhQgxSKXGsVwzBYl0+RcY+1Lo1L2U=
 =8wrr
 -----END PGP SIGNATURE-----

Merge tag 'vmscape-for-linus-20250904' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull vmscape mitigation fixes from Dave Hansen:
 "Mitigate vmscape issue with indirect branch predictor flushes.

  vmscape is a vulnerability that essentially takes Spectre-v2 and
  attacks host userspace from a guest. It particularly affects
  hypervisors like QEMU.

  Even if a hypervisor may not have any sensitive data like disk
  encryption keys, guest-userspace may be able to attack the
  guest-kernel using the hypervisor as a confused deputy.

  There are many ways to mitigate vmscape using the existing Spectre-v2
  defenses like IBRS variants or the IBPB flushes. This series focuses
  solely on IBPB because it works universally across vendors and all
  vulnerable processors. Further work doing vendor and model-specific
  optimizations can build on top of this if needed / wanted.

  Do the normal issue mitigation dance:

   - Add the CPU bug boilerplate

   - Add a list of vulnerable CPUs

   - Use IBPB to flush the branch predictors after running guests"

* tag 'vmscape-for-linus-20250904' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vmscape: Add old Intel CPUs to affected list
  x86/vmscape: Warn when STIBP is disabled with SMT
  x86/bugs: Move cpu_bugs_smt_update() down
  x86/vmscape: Enable the mitigation
  x86/vmscape: Add conditional IBPB mitigation
  x86/vmscape: Enumerate VMSCAPE bug
  Documentation/hw-vuln: Add VMSCAPE documentation
Linus Torvalds 2025-09-10 20:52:16 -07:00
commit 223ba8ee0a
13 changed files with 414 additions and 113 deletions

Documentation/ABI/testing/sysfs-devices-system-cpu

@ -586,6 +586,7 @@ What: /sys/devices/system/cpu/vulnerabilities
/sys/devices/system/cpu/vulnerabilities/srbds
/sys/devices/system/cpu/vulnerabilities/tsa
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
/sys/devices/system/cpu/vulnerabilities/vmscape
Date: January 2018
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description: Information about CPU vulnerabilities

Documentation/admin-guide/hw-vuln/index.rst

@ -26,3 +26,4 @@ are configurable at compile, boot or run time.
rsb
old_microcode
indirect-target-selection
vmscape

Documentation/admin-guide/hw-vuln/vmscape.rst

@ -0,0 +1,110 @@
.. SPDX-License-Identifier: GPL-2.0
VMSCAPE
=======
VMSCAPE is a vulnerability that may allow a guest to influence branch
prediction in host userspace. It particularly affects hypervisors like QEMU.

Even if a hypervisor holds no sensitive data such as disk encryption keys,
guest userspace may be able to attack the guest kernel by using the hypervisor
as a confused deputy.
Affected processors
-------------------
The following CPU families are affected by VMSCAPE:

**Intel processors:**
- Skylake generation (parts without Enhanced IBRS)
- Cascade Lake generation (parts affected by ITS guest/host separation)
- Alder Lake and newer (parts affected by BHI)

Note that BHI-affected parts that use the BHB-clearing software mitigation,
e.g. Ice Lake, are not vulnerable to VMSCAPE.

**AMD processors:**
- Zen series (families 0x17, 0x19, 0x1a)

**Hygon processors:**
- Family 0x18
Mitigation
----------
Conditional IBPB
----------------
The kernel tracks when a CPU has run a potentially malicious guest and issues
an IBPB before the first exit to userspace after a VM-exit. If userspace did
not run between the VM-exit and the next VM-entry, no IBPB is issued.

Note that the existing userspace mitigations against Spectre-v2 are still
effective in protecting userspace. They are, however, insufficient to protect
a userspace VMM from a malicious guest: Spectre-v2 mitigations are applied at
context-switch time, while the userspace VMM can run after a VM-exit without a
context switch.

Vulnerability enumeration and mitigation are not applied inside a guest,
because nested hypervisors should already be deploying IBPB to isolate
themselves from nested guests.
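
In outline, the host-side flow mirrors the entry-common.h and kvm/x86.c hunks
further down in this commit. The following is only a minimal, illustrative
sketch of that pattern (names shortened; the real code is additionally gated
on the synthetic X86_FEATURE_IBPB_EXIT_TO_USER flag):

#include <linux/percpu.h>
#include <asm/nospec-branch.h>

/* Sketch only: the kernel's flag is the per-CPU x86_ibpb_exit_to_user. */
static DEFINE_PER_CPU(bool, ibpb_pending);

/* After a guest has run on this CPU (cf. vcpu_enter_guest()). */
static void mark_guest_ran(void)
{
	/* The guest may have trained this CPU's branch predictors. */
	this_cpu_write(ibpb_pending, true);
}

/* On the way back to userspace (cf. arch_exit_to_user_mode_prepare()). */
static void flush_before_user(void)
{
	/* Flush once, and only if a guest ran here since the last flush. */
	if (this_cpu_read(ibpb_pending)) {
		indirect_branch_prediction_barrier();
		this_cpu_write(ibpb_pending, false);
	}
}
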
SMT considerations
------------------
When Simultaneous Multi-Threading (SMT) is enabled, hypervisors can be
vulnerable to cross-thread attacks. For complete protection against VMSCAPE
attacks in SMT environments, STIBP should be enabled.
The kernel will issue a warning if SMT is enabled without adequate STIBP
protection. The warning is not issued when:
- SMT is disabled
- STIBP is enabled system-wide
- Intel eIBRS is enabled (which implies STIBP protection)
System information and options
------------------------------
The sysfs file showing VMSCAPE mitigation status is:
/sys/devices/system/cpu/vulnerabilities/vmscape
The possible values in this file are:
* 'Not affected':
The processor is not vulnerable to VMSCAPE attacks.
* 'Vulnerable':
The processor is vulnerable and no mitigation has been applied.
* 'Mitigation: IBPB before exit to userspace':
Conditional IBPB mitigation is enabled. The kernel tracks when a CPU has
run a potentially malicious guest and issues an IBPB before the first
exit to userspace after VM-exit.
* 'Mitigation: IBPB on VMEXIT':
IBPB is issued on every VM-exit. This occurs when other mitigations like
RETBLEED or SRSO are already issuing IBPB on VM-exit.
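
The reported state can be read from userspace like any other sysfs attribute.
A minimal sketch (an illustrative program, not part of this series):

#include <stdio.h>

int main(void)
{
	char line[128];
	FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/vmscape", "r");

	if (!f)
		return 1;	/* file is absent on kernels without this change */
	if (fgets(line, sizeof(line), f))
		fputs(line, stdout);	/* e.g. "Mitigation: IBPB before exit to userspace" */
	fclose(f);
	return 0;
}
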
Mitigation control on the kernel command line
----------------------------------------------
The mitigation can be controlled via the ``vmscape=`` command line parameter:
* ``vmscape=off``:
Disable the VMSCAPE mitigation.
* ``vmscape=ibpb``:
Enable conditional IBPB mitigation (default when CONFIG_MITIGATION_VMSCAPE=y).
* ``vmscape=force``:
Force vulnerability detection and mitigation even on processors that are
not known to be affected.

Documentation/admin-guide/kernel-parameters.txt

@ -3829,6 +3829,7 @@
srbds=off [X86,INTEL]
ssbd=force-off [ARM64]
tsx_async_abort=off [X86]
vmscape=off [X86]
Exceptions:
This does not have any effect on
@ -8041,6 +8042,16 @@
vmpoff= [KNL,S390] Perform z/VM CP command after power off.
Format: <command>
vmscape= [X86] Controls mitigation for VMscape attacks.
VMscape attacks can leak information from a userspace
hypervisor to a guest via speculative side-channels.
off - disable the mitigation
ibpb - use Indirect Branch Prediction Barrier
(IBPB) mitigation (default)
force - force vulnerability detection even on
unaffected processors
vsyscall= [X86-64,EARLY]
Controls the behavior of vsyscalls (i.e. calls to
fixed addresses of 0xffffffffff600x00 from legacy

arch/x86/Kconfig

@ -2701,6 +2701,15 @@ config MITIGATION_TSA
security vulnerability on AMD CPUs which can lead to forwarding of
invalid info to subsequent instructions and thus can affect their
timing and thereby cause a leakage.
config MITIGATION_VMSCAPE
bool "Mitigate VMSCAPE"
depends on KVM
default y
help
Enable mitigation for VMSCAPE attacks. VMSCAPE is a hardware security
vulnerability on Intel and AMD CPUs that may allow a guest to do
Spectre v2 style attacks on the userspace hypervisor.
endif
config ARCH_HAS_ADD_PAGES

arch/x86/include/asm/cpufeatures.h

@ -495,6 +495,7 @@
#define X86_FEATURE_TSA_SQ_NO (21*32+11) /* AMD CPU not vulnerable to TSA-SQ */
#define X86_FEATURE_TSA_L1_NO (21*32+12) /* AMD CPU not vulnerable to TSA-L1 */
#define X86_FEATURE_CLEAR_CPU_BUF_VM (21*32+13) /* Clear CPU buffers using VERW before VMRUN */
#define X86_FEATURE_IBPB_EXIT_TO_USER (21*32+14) /* Use IBPB on exit-to-userspace, see VMSCAPE bug */
/*
* BUG word(s)
@ -551,4 +552,5 @@
#define X86_BUG_ITS X86_BUG( 1*32+ 7) /* "its" CPU is affected by Indirect Target Selection */
#define X86_BUG_ITS_NATIVE_ONLY X86_BUG( 1*32+ 8) /* "its_native_only" CPU is affected by ITS, VMX is not affected */
#define X86_BUG_TSA X86_BUG( 1*32+ 9) /* "tsa" CPU is affected by Transient Scheduler Attacks */
#define X86_BUG_VMSCAPE X86_BUG( 1*32+10) /* "vmscape" CPU is affected by VMSCAPE attacks from guests */
#endif /* _ASM_X86_CPUFEATURES_H */

arch/x86/include/asm/entry-common.h

@ -93,6 +93,13 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
* 8 (ia32) bits.
*/
choose_random_kstack_offset(rdtsc());
/* Avoid unnecessary reads of 'x86_ibpb_exit_to_user' */
if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
this_cpu_read(x86_ibpb_exit_to_user)) {
indirect_branch_prediction_barrier();
this_cpu_write(x86_ibpb_exit_to_user, false);
}
}
#define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare

arch/x86/include/asm/nospec-branch.h

@ -530,6 +530,8 @@ void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature)
: "memory");
}
DECLARE_PER_CPU(bool, x86_ibpb_exit_to_user);
static inline void indirect_branch_prediction_barrier(void)
{
asm_inline volatile(ALTERNATIVE("", "call write_ibpb", X86_FEATURE_IBPB)

arch/x86/kernel/cpu/bugs.c

@ -96,6 +96,9 @@ static void __init its_update_mitigation(void);
static void __init its_apply_mitigation(void);
static void __init tsa_select_mitigation(void);
static void __init tsa_apply_mitigation(void);
static void __init vmscape_select_mitigation(void);
static void __init vmscape_update_mitigation(void);
static void __init vmscape_apply_mitigation(void);
/* The base value of the SPEC_CTRL MSR without task-specific bits set */
u64 x86_spec_ctrl_base;
@ -105,6 +108,14 @@ EXPORT_SYMBOL_GPL(x86_spec_ctrl_base);
DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current);
/*
* Set when the CPU has run a potentially malicious guest. An IBPB will
* be needed before running userspace. That IBPB will flush the branch
* predictor content.
*/
DEFINE_PER_CPU(bool, x86_ibpb_exit_to_user);
EXPORT_PER_CPU_SYMBOL_GPL(x86_ibpb_exit_to_user);
u64 x86_pred_cmd __ro_after_init = PRED_CMD_IBPB;
static u64 __ro_after_init x86_arch_cap_msr;
@ -262,6 +273,7 @@ void __init cpu_select_mitigations(void)
its_select_mitigation();
bhi_select_mitigation();
tsa_select_mitigation();
vmscape_select_mitigation();
/*
* After mitigations are selected, some may need to update their
@ -293,6 +305,7 @@ void __init cpu_select_mitigations(void)
bhi_update_mitigation();
/* srso_update_mitigation() depends on retbleed_update_mitigation(). */
srso_update_mitigation();
vmscape_update_mitigation();
spectre_v1_apply_mitigation();
spectre_v2_apply_mitigation();
@ -310,6 +323,7 @@ void __init cpu_select_mitigations(void)
its_apply_mitigation();
bhi_apply_mitigation();
tsa_apply_mitigation();
vmscape_apply_mitigation();
}
/*
@ -2538,88 +2552,6 @@ static void update_mds_branch_idle(void)
}
}
#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
#define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n"
#define MMIO_MSG_SMT "MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.\n"
void cpu_bugs_smt_update(void)
{
mutex_lock(&spec_ctrl_mutex);
if (sched_smt_active() && unprivileged_ebpf_enabled() &&
spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE)
pr_warn_once(SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG);
switch (spectre_v2_user_stibp) {
case SPECTRE_V2_USER_NONE:
break;
case SPECTRE_V2_USER_STRICT:
case SPECTRE_V2_USER_STRICT_PREFERRED:
update_stibp_strict();
break;
case SPECTRE_V2_USER_PRCTL:
case SPECTRE_V2_USER_SECCOMP:
update_indir_branch_cond();
break;
}
switch (mds_mitigation) {
case MDS_MITIGATION_FULL:
case MDS_MITIGATION_AUTO:
case MDS_MITIGATION_VMWERV:
if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
pr_warn_once(MDS_MSG_SMT);
update_mds_branch_idle();
break;
case MDS_MITIGATION_OFF:
break;
}
switch (taa_mitigation) {
case TAA_MITIGATION_VERW:
case TAA_MITIGATION_AUTO:
case TAA_MITIGATION_UCODE_NEEDED:
if (sched_smt_active())
pr_warn_once(TAA_MSG_SMT);
break;
case TAA_MITIGATION_TSX_DISABLED:
case TAA_MITIGATION_OFF:
break;
}
switch (mmio_mitigation) {
case MMIO_MITIGATION_VERW:
case MMIO_MITIGATION_AUTO:
case MMIO_MITIGATION_UCODE_NEEDED:
if (sched_smt_active())
pr_warn_once(MMIO_MSG_SMT);
break;
case MMIO_MITIGATION_OFF:
break;
}
switch (tsa_mitigation) {
case TSA_MITIGATION_USER_KERNEL:
case TSA_MITIGATION_VM:
case TSA_MITIGATION_AUTO:
case TSA_MITIGATION_FULL:
/*
* TSA-SQ can potentially lead to info leakage between
* SMT threads.
*/
if (sched_smt_active())
static_branch_enable(&cpu_buf_idle_clear);
else
static_branch_disable(&cpu_buf_idle_clear);
break;
case TSA_MITIGATION_NONE:
case TSA_MITIGATION_UCODE_NEEDED:
break;
}
mutex_unlock(&spec_ctrl_mutex);
}
#undef pr_fmt
#define pr_fmt(fmt) "Speculative Store Bypass: " fmt
@ -3330,9 +3262,185 @@ static void __init srso_apply_mitigation(void)
}
}
#undef pr_fmt
#define pr_fmt(fmt) "VMSCAPE: " fmt
enum vmscape_mitigations {
VMSCAPE_MITIGATION_NONE,
VMSCAPE_MITIGATION_AUTO,
VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER,
VMSCAPE_MITIGATION_IBPB_ON_VMEXIT,
};
static const char * const vmscape_strings[] = {
[VMSCAPE_MITIGATION_NONE] = "Vulnerable",
/* [VMSCAPE_MITIGATION_AUTO] */
[VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER] = "Mitigation: IBPB before exit to userspace",
[VMSCAPE_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT",
};
static enum vmscape_mitigations vmscape_mitigation __ro_after_init =
IS_ENABLED(CONFIG_MITIGATION_VMSCAPE) ? VMSCAPE_MITIGATION_AUTO : VMSCAPE_MITIGATION_NONE;
static int __init vmscape_parse_cmdline(char *str)
{
if (!str)
return -EINVAL;
if (!strcmp(str, "off")) {
vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
} else if (!strcmp(str, "ibpb")) {
vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
} else if (!strcmp(str, "force")) {
setup_force_cpu_bug(X86_BUG_VMSCAPE);
vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
} else {
pr_err("Ignoring unknown vmscape=%s option.\n", str);
}
return 0;
}
early_param("vmscape", vmscape_parse_cmdline);
static void __init vmscape_select_mitigation(void)
{
if (cpu_mitigations_off() ||
!boot_cpu_has_bug(X86_BUG_VMSCAPE) ||
!boot_cpu_has(X86_FEATURE_IBPB)) {
vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
return;
}
if (vmscape_mitigation == VMSCAPE_MITIGATION_AUTO)
vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
}
static void __init vmscape_update_mitigation(void)
{
if (!boot_cpu_has_bug(X86_BUG_VMSCAPE))
return;
if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB ||
srso_mitigation == SRSO_MITIGATION_IBPB_ON_VMEXIT)
vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_ON_VMEXIT;
pr_info("%s\n", vmscape_strings[vmscape_mitigation]);
}
static void __init vmscape_apply_mitigation(void)
{
if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER)
setup_force_cpu_cap(X86_FEATURE_IBPB_EXIT_TO_USER);
}
#undef pr_fmt
#define pr_fmt(fmt) fmt
#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
#define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n"
#define MMIO_MSG_SMT "MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.\n"
#define VMSCAPE_MSG_SMT "VMSCAPE: SMT on, STIBP is required for full protection. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/vmscape.html for more details.\n"
void cpu_bugs_smt_update(void)
{
mutex_lock(&spec_ctrl_mutex);
if (sched_smt_active() && unprivileged_ebpf_enabled() &&
spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE)
pr_warn_once(SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG);
switch (spectre_v2_user_stibp) {
case SPECTRE_V2_USER_NONE:
break;
case SPECTRE_V2_USER_STRICT:
case SPECTRE_V2_USER_STRICT_PREFERRED:
update_stibp_strict();
break;
case SPECTRE_V2_USER_PRCTL:
case SPECTRE_V2_USER_SECCOMP:
update_indir_branch_cond();
break;
}
switch (mds_mitigation) {
case MDS_MITIGATION_FULL:
case MDS_MITIGATION_AUTO:
case MDS_MITIGATION_VMWERV:
if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
pr_warn_once(MDS_MSG_SMT);
update_mds_branch_idle();
break;
case MDS_MITIGATION_OFF:
break;
}
switch (taa_mitigation) {
case TAA_MITIGATION_VERW:
case TAA_MITIGATION_AUTO:
case TAA_MITIGATION_UCODE_NEEDED:
if (sched_smt_active())
pr_warn_once(TAA_MSG_SMT);
break;
case TAA_MITIGATION_TSX_DISABLED:
case TAA_MITIGATION_OFF:
break;
}
switch (mmio_mitigation) {
case MMIO_MITIGATION_VERW:
case MMIO_MITIGATION_AUTO:
case MMIO_MITIGATION_UCODE_NEEDED:
if (sched_smt_active())
pr_warn_once(MMIO_MSG_SMT);
break;
case MMIO_MITIGATION_OFF:
break;
}
switch (tsa_mitigation) {
case TSA_MITIGATION_USER_KERNEL:
case TSA_MITIGATION_VM:
case TSA_MITIGATION_AUTO:
case TSA_MITIGATION_FULL:
/*
* TSA-SQ can potentially lead to info leakage between
* SMT threads.
*/
if (sched_smt_active())
static_branch_enable(&cpu_buf_idle_clear);
else
static_branch_disable(&cpu_buf_idle_clear);
break;
case TSA_MITIGATION_NONE:
case TSA_MITIGATION_UCODE_NEEDED:
break;
}
switch (vmscape_mitigation) {
case VMSCAPE_MITIGATION_NONE:
case VMSCAPE_MITIGATION_AUTO:
break;
case VMSCAPE_MITIGATION_IBPB_ON_VMEXIT:
case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER:
/*
* Hypervisors can be attacked across-threads, warn for SMT when
* STIBP is not already enabled system-wide.
*
* Intel eIBRS (!AUTOIBRS) implies STIBP on.
*/
if (!sched_smt_active() ||
spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED ||
(spectre_v2_in_eibrs_mode(spectre_v2_enabled) &&
!boot_cpu_has(X86_FEATURE_AUTOIBRS)))
break;
pr_warn_once(VMSCAPE_MSG_SMT);
break;
}
mutex_unlock(&spec_ctrl_mutex);
}
#ifdef CONFIG_SYSFS
#define L1TF_DEFAULT_MSG "Mitigation: PTE Inversion"
@ -3578,6 +3686,11 @@ static ssize_t tsa_show_state(char *buf)
return sysfs_emit(buf, "%s\n", tsa_strings[tsa_mitigation]);
}
static ssize_t vmscape_show_state(char *buf)
{
return sysfs_emit(buf, "%s\n", vmscape_strings[vmscape_mitigation]);
}
static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
char *buf, unsigned int bug)
{
@ -3644,6 +3757,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
case X86_BUG_TSA:
return tsa_show_state(buf);
case X86_BUG_VMSCAPE:
return vmscape_show_state(buf);
default:
break;
}
@ -3735,6 +3851,11 @@ ssize_t cpu_show_tsa(struct device *dev, struct device_attribute *attr, char *bu
{
return cpu_show_common(dev, attr, buf, X86_BUG_TSA);
}
ssize_t cpu_show_vmscape(struct device *dev, struct device_attribute *attr, char *buf)
{
return cpu_show_common(dev, attr, buf, X86_BUG_VMSCAPE);
}
#endif
void __warn_thunk(void)

arch/x86/kernel/cpu/common.c

@ -1236,55 +1236,71 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
#define ITS_NATIVE_ONLY BIT(9)
/* CPU is affected by Transient Scheduler Attacks */
#define TSA BIT(10)
/* CPU is affected by VMSCAPE */
#define VMSCAPE BIT(11)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
VULNBL_INTEL_STEPS(INTEL_IVYBRIDGE, X86_STEP_MAX, SRBDS),
VULNBL_INTEL_STEPS(INTEL_HASWELL, X86_STEP_MAX, SRBDS),
VULNBL_INTEL_STEPS(INTEL_HASWELL_L, X86_STEP_MAX, SRBDS),
VULNBL_INTEL_STEPS(INTEL_HASWELL_G, X86_STEP_MAX, SRBDS),
VULNBL_INTEL_STEPS(INTEL_HASWELL_X, X86_STEP_MAX, MMIO),
VULNBL_INTEL_STEPS(INTEL_BROADWELL_D, X86_STEP_MAX, MMIO),
VULNBL_INTEL_STEPS(INTEL_BROADWELL_G, X86_STEP_MAX, SRBDS),
VULNBL_INTEL_STEPS(INTEL_BROADWELL_X, X86_STEP_MAX, MMIO),
VULNBL_INTEL_STEPS(INTEL_BROADWELL, X86_STEP_MAX, SRBDS),
VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, 0x5, MMIO | RETBLEED | GDS),
VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, X86_STEP_MAX, MMIO | RETBLEED | GDS | ITS),
VULNBL_INTEL_STEPS(INTEL_SKYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS),
VULNBL_INTEL_STEPS(INTEL_SKYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS),
VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, 0xb, MMIO | RETBLEED | GDS | SRBDS),
VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS),
VULNBL_INTEL_STEPS(INTEL_KABYLAKE, 0xc, MMIO | RETBLEED | GDS | SRBDS),
VULNBL_INTEL_STEPS(INTEL_KABYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS),
VULNBL_INTEL_STEPS(INTEL_CANNONLAKE_L, X86_STEP_MAX, RETBLEED),
VULNBL_INTEL_STEPS(INTEL_SANDYBRIDGE_X, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_SANDYBRIDGE, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_IVYBRIDGE_X, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_IVYBRIDGE, X86_STEP_MAX, SRBDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_HASWELL, X86_STEP_MAX, SRBDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_HASWELL_L, X86_STEP_MAX, SRBDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_HASWELL_G, X86_STEP_MAX, SRBDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_HASWELL_X, X86_STEP_MAX, MMIO | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_BROADWELL_D, X86_STEP_MAX, MMIO | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_BROADWELL_X, X86_STEP_MAX, MMIO | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_BROADWELL_G, X86_STEP_MAX, SRBDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_BROADWELL, X86_STEP_MAX, SRBDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, 0x5, MMIO | RETBLEED | GDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_SKYLAKE_X, X86_STEP_MAX, MMIO | RETBLEED | GDS | ITS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_SKYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_SKYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, 0xb, MMIO | RETBLEED | GDS | SRBDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_KABYLAKE_L, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_KABYLAKE, 0xc, MMIO | RETBLEED | GDS | SRBDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_KABYLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | SRBDS | ITS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_CANNONLAKE_L, X86_STEP_MAX, RETBLEED | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_ICELAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY),
VULNBL_INTEL_STEPS(INTEL_ICELAKE_D, X86_STEP_MAX, MMIO | GDS | ITS | ITS_NATIVE_ONLY),
VULNBL_INTEL_STEPS(INTEL_ICELAKE_X, X86_STEP_MAX, MMIO | GDS | ITS | ITS_NATIVE_ONLY),
VULNBL_INTEL_STEPS(INTEL_COMETLAKE, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS),
VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, 0x0, MMIO | RETBLEED | ITS),
VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS),
VULNBL_INTEL_STEPS(INTEL_COMETLAKE, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, 0x0, MMIO | RETBLEED | ITS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_COMETLAKE_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED | GDS | ITS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_TIGERLAKE_L, X86_STEP_MAX, GDS | ITS | ITS_NATIVE_ONLY),
VULNBL_INTEL_STEPS(INTEL_TIGERLAKE, X86_STEP_MAX, GDS | ITS | ITS_NATIVE_ONLY),
VULNBL_INTEL_STEPS(INTEL_LAKEFIELD, X86_STEP_MAX, MMIO | MMIO_SBDS | RETBLEED),
VULNBL_INTEL_STEPS(INTEL_ROCKETLAKE, X86_STEP_MAX, MMIO | RETBLEED | GDS | ITS | ITS_NATIVE_ONLY),
VULNBL_INTEL_TYPE(INTEL_ALDERLAKE, ATOM, RFDS),
VULNBL_INTEL_STEPS(INTEL_ALDERLAKE_L, X86_STEP_MAX, RFDS),
VULNBL_INTEL_TYPE(INTEL_RAPTORLAKE, ATOM, RFDS),
VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE_P, X86_STEP_MAX, RFDS),
VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE_S, X86_STEP_MAX, RFDS),
VULNBL_INTEL_STEPS(INTEL_ATOM_GRACEMONT, X86_STEP_MAX, RFDS),
VULNBL_INTEL_TYPE(INTEL_ALDERLAKE, ATOM, RFDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_ALDERLAKE, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_ALDERLAKE_L, X86_STEP_MAX, RFDS | VMSCAPE),
VULNBL_INTEL_TYPE(INTEL_RAPTORLAKE, ATOM, RFDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE_P, X86_STEP_MAX, RFDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_RAPTORLAKE_S, X86_STEP_MAX, RFDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_METEORLAKE_L, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_ARROWLAKE_H, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_ARROWLAKE, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_ARROWLAKE_U, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_LUNARLAKE_M, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_SAPPHIRERAPIDS_X, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_GRANITERAPIDS_X, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_EMERALDRAPIDS_X, X86_STEP_MAX, VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_ATOM_GRACEMONT, X86_STEP_MAX, RFDS | VMSCAPE),
VULNBL_INTEL_STEPS(INTEL_ATOM_TREMONT, X86_STEP_MAX, MMIO | MMIO_SBDS | RFDS),
VULNBL_INTEL_STEPS(INTEL_ATOM_TREMONT_D, X86_STEP_MAX, MMIO | RFDS),
VULNBL_INTEL_STEPS(INTEL_ATOM_TREMONT_L, X86_STEP_MAX, MMIO | MMIO_SBDS | RFDS),
VULNBL_INTEL_STEPS(INTEL_ATOM_GOLDMONT, X86_STEP_MAX, RFDS),
VULNBL_INTEL_STEPS(INTEL_ATOM_GOLDMONT_D, X86_STEP_MAX, RFDS),
VULNBL_INTEL_STEPS(INTEL_ATOM_GOLDMONT_PLUS, X86_STEP_MAX, RFDS),
VULNBL_INTEL_STEPS(INTEL_ATOM_CRESTMONT_X, X86_STEP_MAX, VMSCAPE),
VULNBL_AMD(0x15, RETBLEED),
VULNBL_AMD(0x16, RETBLEED),
VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO),
VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO),
VULNBL_AMD(0x19, SRSO | TSA),
VULNBL_AMD(0x1a, SRSO),
VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO | VMSCAPE),
VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO | VMSCAPE),
VULNBL_AMD(0x19, SRSO | TSA | VMSCAPE),
VULNBL_AMD(0x1a, SRSO | VMSCAPE),
{}
};
@ -1543,6 +1559,14 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
}
}
/*
* Set the bug only on bare-metal. A nested hypervisor should already be
* deploying IBPB to isolate itself from nested guests.
*/
if (cpu_matches(cpu_vuln_blacklist, VMSCAPE) &&
!boot_cpu_has(X86_FEATURE_HYPERVISOR))
setup_force_cpu_bug(X86_BUG_VMSCAPE);
if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
return;

arch/x86/kvm/x86.c

@ -11010,6 +11010,15 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
if (vcpu->arch.guest_fpu.xfd_err)
wrmsrq(MSR_IA32_XFD_ERR, 0);
/*
* Mark this CPU as needing a branch predictor flush before running
* userspace. Must be done before enabling preemption to ensure it gets
* set for the CPU that actually ran the guest, and not the CPU that it
* may migrate to.
*/
if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
this_cpu_write(x86_ibpb_exit_to_user, true);
/*
* Consume any pending interrupts, including the possible source of
* VM-Exit on SVM and any ticks that occur between VM-Exit and now.

drivers/base/cpu.c

@ -603,6 +603,7 @@ CPU_SHOW_VULN_FALLBACK(ghostwrite);
CPU_SHOW_VULN_FALLBACK(old_microcode);
CPU_SHOW_VULN_FALLBACK(indirect_target_selection);
CPU_SHOW_VULN_FALLBACK(tsa);
CPU_SHOW_VULN_FALLBACK(vmscape);
static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
@ -622,6 +623,7 @@ static DEVICE_ATTR(ghostwrite, 0444, cpu_show_ghostwrite, NULL);
static DEVICE_ATTR(old_microcode, 0444, cpu_show_old_microcode, NULL);
static DEVICE_ATTR(indirect_target_selection, 0444, cpu_show_indirect_target_selection, NULL);
static DEVICE_ATTR(tsa, 0444, cpu_show_tsa, NULL);
static DEVICE_ATTR(vmscape, 0444, cpu_show_vmscape, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_meltdown.attr,
@ -642,6 +644,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_old_microcode.attr,
&dev_attr_indirect_target_selection.attr,
&dev_attr_tsa.attr,
&dev_attr_vmscape.attr,
NULL
};

include/linux/cpu.h

@ -83,6 +83,7 @@ extern ssize_t cpu_show_old_microcode(struct device *dev,
extern ssize_t cpu_show_indirect_target_selection(struct device *dev,
struct device_attribute *attr, char *buf);
extern ssize_t cpu_show_tsa(struct device *dev, struct device_attribute *attr, char *buf);
extern ssize_t cpu_show_vmscape(struct device *dev, struct device_attribute *attr, char *buf);
extern __printf(4, 5)
struct device *cpu_device_create(struct device *parent, void *drvdata,