// SPDX-License-Identifier: GPL-2.0
/*
 * ucall support. A ucall is a "hypercall to userspace".
 *
 * Copyright (C) 2018, Red Hat, Inc.
 */
#include "kvm_util.h"
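/*
 * PIO port that ucall_arch_do_ucall() accesses to force an exit to host
 * userspace (KVM_EXIT_IO).
 */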
#define UCALL_PIO_PORT ((uint16_t)0x1000)
void ucall_arch_do_ucall(vm_vaddr_t uc)
{
	/*
	 * FIXME: Revert this hack (the entire commit that added it) once nVMX
	 * preserves L2 GPRs across a nested VM-Exit. If a ucall from L2, e.g.
	 * to do a GUEST_SYNC(), lands the vCPU in L1, any and all GPRs can be
	 * clobbered by L1. Save and restore non-volatile GPRs (clobbering RBP
	 * in particular is problematic) along with RDX and RDI (which are
	 * inputs), and clobber volatile GPRs. *sigh*
	 */
#define HORRIFIC_L2_UCALL_CLOBBER_HACK	\
	"rcx", "rsi", "r8", "r9", "r10", "r11"

	asm volatile("push %%rbp\n\t"
		     "push %%r15\n\t"
		     "push %%r14\n\t"
		     "push %%r13\n\t"
		     "push %%r12\n\t"
		     "push %%rbx\n\t"
		     "push %%rdx\n\t"
		     "push %%rdi\n\t"
		     "in %[port], %%al\n\t"
		     "pop %%rdi\n\t"
		     "pop %%rdx\n\t"
		     "pop %%rbx\n\t"
		     "pop %%r12\n\t"
		     "pop %%r13\n\t"
		     "pop %%r14\n\t"
		     "pop %%r15\n\t"
		     "pop %%rbp\n\t"
		     : : [port] "d" (UCALL_PIO_PORT), "D" (uc) : "rax", "memory",
		       HORRIFIC_L2_UCALL_CLOBBER_HACK);
}
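
/*
 * Illustrative guest-side flow (a sketch, not code in this file): guests are
 * assumed to reach ucall_arch_do_ucall() via the common ucall() helper and
 * its wrappers, e.g. GUEST_SYNC()/GUEST_DONE() from the shared ucall code:
 *
 *	static void guest_code(void)
 *	{
 *		GUEST_SYNC(1);	// ucall(UCALL_SYNC, ...) -> ucall_arch_do_ucall()
 *		GUEST_DONE();	// ucall(UCALL_DONE, ...)
 *	}
 *
 * The common code fills a struct ucall and passes its guest address as @uc;
 * the "in" above then forces a KVM_EXIT_IO on UCALL_PIO_PORT so that host
 * userspace can recover the pointer via ucall_arch_get_ucall() below.
 */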
void *ucall_arch_get_ucall(struct kvm_vcpu *vcpu)
{
	struct kvm_run *run = vcpu->run;

	if (run->exit_reason == KVM_EXIT_IO && run->io.port == UCALL_PIO_PORT) {
		struct kvm_regs regs;

		vcpu_regs_get(vcpu, &regs);
		return (void *)regs.rdi;
	}
	return NULL;
}
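
/*
 * Illustrative host-side flow (a sketch; get_ucall(), vcpu_run() and the
 * UCALL_* handling below come from the common selftest code, not this file):
 * tests normally consume ucalls via get_ucall(), which calls
 * ucall_arch_get_ucall() above and copies the struct ucall out of guest
 * memory:
 *
 *	struct ucall uc;
 *
 *	vcpu_run(vcpu);
 *	switch (get_ucall(vcpu, &uc)) {
 *	case UCALL_SYNC:
 *		// uc.args[1] is assumed to carry the stage from GUEST_SYNC()
 *		break;
 *	case UCALL_ABORT:
 *		REPORT_GUEST_ASSERT(uc);
 *		break;
 *	case UCALL_DONE:
 *		break;
 *	}
 */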