mirror of
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2025-08-05 16:54:27 +00:00

Document a flaw in KVM's ABI which lets userspace attempt to inject a "bad" hardware exception event, and thus induce VM-Fail on Intel CPUs. Fixing the flaw is a fool's errand, as AMD doesn't sanity check the validity of the error code, Intel CPUs that support CET relax the check for Protected Mode, userspace can change the mode after queueing an exception, KVM ignores the error code when emulating Real Mode exceptions, and so on and so forth.

The VM-Fail itself doesn't harm KVM or the kernel beyond triggering a ratelimited pr_warn(), so just document the oddity.

Link: https://lore.kernel.org/r/20240802200420.330769-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
80 lines
No EOL
3.2 KiB
ReStructuredText
.. SPDX-License-Identifier: GPL-2.0

=======================================
Known limitations of CPU virtualization
=======================================

Whenever perfect emulation of a CPU feature is impossible or too hard, KVM
has to choose between not implementing the feature at all or introducing
behavioral differences between virtual machines and bare metal systems.

This file documents some of the known limitations that KVM has in
virtualizing CPU features.

x86
===

``KVM_GET_SUPPORTED_CPUID`` issues
----------------------------------

x87 features
~~~~~~~~~~~~

Unlike most other CPUID feature bits, CPUID[EAX=7,ECX=0]:EBX[6]
(FDP_EXCPTN_ONLY) and CPUID[EAX=7,ECX=0]:EBX[13] (ZERO_FCS_FDS) are
clear if the features are present and set if the features are not present.

Clearing these bits in CPUID has no effect on the operation of the guest;
if these bits are set on hardware, the features will not be present on
any virtual machine that runs on that hardware.

**Workaround:** It is recommended to always set these bits in guest CPUID.
Note however that any software (e.g. ``WIN87EM.DLL``) expecting these features
to be present likely predates these CPUID feature bits, and therefore
doesn't know to check for them anyway.

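
As a rough illustration of the workaround, a VMM could OR both bits into the
EBX value of the CPUID[EAX=7,ECX=0] entry before handing the table to KVM,
e.g. via the ``ebx`` field of the matching ``struct kvm_cpuid_entry2``. The
macro and function names below are hypothetical and exist only for this
sketch; only the bit positions come from the text above.

.. code-block:: c

    #include <stdint.h>

    /* Bit positions in CPUID[EAX=7,ECX=0]:EBX, as listed above. */
    #define SKETCH_EBX_FDP_EXCPTN_ONLY  (UINT32_C(1) << 6)
    #define SKETCH_EBX_ZERO_FCS_FDS     (UINT32_C(1) << 13)

    /*
     * Hypothetical helper: return the EBX value with both "feature is
     * absent" bits set, leaving every other bit untouched.
     */
    static inline uint32_t sketch_x87_workaround_ebx(uint32_t ebx)
    {
        return ebx | SKETCH_EBX_FDP_EXCPTN_ONLY | SKETCH_EBX_ZERO_FCS_FDS;
    }

The helper is idempotent, so it is safe to apply it to a value that already
has one or both bits set.
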
``KVM_SET_VCPU_EVENTS`` issue
-----------------------------

Invalid KVM_SET_VCPU_EVENTS input with respect to error codes *may* result in
failed VM-Entry on Intel CPUs. Pre-CET Intel CPUs require that exception
injection through the VMCS correctly set the "error code valid" flag, e.g.
require the flag be set when injecting a #GP, clear when injecting a #UD,
clear when injecting a soft exception, etc. Intel CPUs that enumerate
IA32_VMX_BASIC[56] as '1' relax VMX's consistency checks, and AMD CPUs have no
restrictions whatsoever. KVM_SET_VCPU_EVENTS doesn't sanity check the vector
versus "has_error_code", i.e. KVM's ABI follows AMD behavior.

A failed VM-Entry of this kind does not harm KVM or the host kernel beyond
triggering a ratelimited ``pr_warn()``.

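
Because KVM does not perform this check, userspace that wants to avoid the
failed VM-Entry can validate its own input first. The sketch below is one
possible check, not KVM code: the function names are hypothetical, and the
vector list is an assumption based on the architectural exceptions that push
an error code in Protected Mode on pre-CET CPUs (#DF, #TS, #NP, #SS, #GP, #PF
and #AC). It deliberately ignores Real Mode, soft exceptions and the CET #CP
exception. The two arguments correspond to ``exception.nr`` and
``exception.has_error_code`` in ``struct kvm_vcpu_events``.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * Assumed list of hardware exception vectors that deliver an error
     * code in Protected Mode on pre-CET CPUs:
     * #DF(8), #TS(10), #NP(11), #SS(12), #GP(13), #PF(14), #AC(17).
     */
    static inline bool sketch_vector_has_error_code(uint8_t vector)
    {
        const uint32_t mask = (UINT32_C(1) << 8)  | (UINT32_C(1) << 10) |
                              (UINT32_C(1) << 11) | (UINT32_C(1) << 12) |
                              (UINT32_C(1) << 13) | (UINT32_C(1) << 14) |
                              (UINT32_C(1) << 17);

        /* Only vectors 0-31 are exceptions; guard the shift as well. */
        return vector < 32 && ((mask >> vector) & 1u);
    }

    /*
     * Hypothetical pre-flight check for a hardware exception that is
     * about to be queued: the flag must match what the vector implies.
     */
    static inline bool sketch_exception_is_consistent(uint8_t vector,
                                                      bool has_error_code)
    {
        return sketch_vector_has_error_code(vector) == has_error_code;
    }

For example, a #GP (vector 13) queued without an error code, or a #UD
(vector 6) queued with one, would be rejected by this check.
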
Nested virtualization features
------------------------------

TBD

x2APIC
------
When KVM_X2APIC_API_USE_32BIT_IDS is enabled, KVM activates a hack/quirk that
allows sending events to a single vCPU using its x2APIC ID even if the target
vCPU has legacy xAPIC enabled, e.g. to bring up hotplugged vCPUs via INIT-SIPI
on VMs with > 255 vCPUs. A side effect of the quirk is that, if multiple vCPUs
have the same physical APIC ID, KVM will deliver events targeting that APIC ID
only to the vCPU with the lowest vCPU ID. If KVM_X2APIC_API_USE_32BIT_IDS is
not enabled, KVM follows x86 architecture when processing interrupts (all vCPUs
matching the target APIC ID receive the interrupt).

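
The difference between the two behaviors can be modeled in a few lines. The
following is a toy model under stated assumptions, not KVM's implementation:
``struct model_vcpu`` and ``model_deliver()`` are hypothetical, and the model
covers only events aimed at one physical APIC ID (no broadcast and no logical
destination mode).

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical stand-in for a vCPU; not a KVM data structure. */
    struct model_vcpu {
        uint32_t vcpu_id;
        uint32_t apic_id;
    };

    /*
     * Record in hit[] which of the n vCPUs receive an event aimed at
     * physical APIC ID 'dest', and return the number of recipients.
     *
     * use_32bit_ids == true:  only the matching vCPU with the lowest
     *                         vCPU ID receives the event (the quirk).
     * use_32bit_ids == false: every matching vCPU receives the event.
     */
    static size_t model_deliver(const struct model_vcpu *vcpus, size_t n,
                                uint32_t dest, bool use_32bit_ids, bool *hit)
    {
        size_t i, count = 0, lowest = n;

        for (i = 0; i < n; i++) {
            hit[i] = false;
            if (vcpus[i].apic_id != dest)
                continue;

            if (!use_32bit_ids) {
                hit[i] = true;
                count++;
            } else if (lowest == n ||
                       vcpus[i].vcpu_id < vcpus[lowest].vcpu_id) {
                lowest = i;
            }
        }

        if (use_32bit_ids && lowest != n) {
            hit[lowest] = true;
            count = 1;
        }

        return count;
    }

With two vCPUs sharing APIC ID 5, the model reports one recipient when the
quirk is active and two recipients when it is not.
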
MTRRs
-----
KVM does not virtualize guest MTRR memory types. KVM emulates accesses to MTRR
MSRs, i.e. {RD,WR}MSR in the guest will behave as expected, but KVM does not
honor guest MTRRs when determining the effective memory type, and instead
treats all of guest memory as having Writeback (WB) MTRRs.

CR0.CD
------
KVM does not virtualize CR0.CD on Intel CPUs. Similar to MTRR MSRs, KVM
emulates CR0.CD accesses so that loads and stores from/to CR0 behave as
expected, but setting CR0.CD=1 has no impact on the cacheability of guest
memory.

Note, this erratum does not affect AMD CPUs, which fully virtualize CR0.CD in
hardware, i.e. put the CPU caches into "no fill" mode when CR0.CD=1, even when
running in the guest.