linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-18 22:14:16 +00:00

Author	SHA1	Message	Date
Paolo Bonzini	86014c1e20	KVM generic changes for 6.11 - Enable halt poll shrinking by default, as Intel found it to be a clear win. - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86. - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out(). - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs. - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout. - A few minor cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmaRuOYACgkQOlYIJqCj N/1UnQ/8CI5Qfr+/0gzYgtWmtEMczGG+rMNpzD3XVqPjJjXcMcBiQnplnzUVLhha vlPdYVK7vgmEt003XGzV55mik46LHL+DX/v4hI3HEdblfyCeNLW3fKEWVRB44qJe o+YUQwSK42SORUp9oXuQINxhA//U9EnI7CQxlJ8w8wenv5IJKfIGr01DefmfGPAV PKm9t6WLcNqvhZMEyy/zmzM3KVPCJL0NcwI97x6sHxFpQYIDtL0E/VexA4AFqMoT QK7cSDC/2US41Zvem/r/GzM/ucdF6vb9suzZYBohwhxtVhwJe2CDeYQZvtNKJ1U7 GOHPaKL6nBWdZCm/yyWbbX2nstY1lHqxhN3JD0X8wqU5rNcwm2b8Vfyav0Ehc7H+ jVbDTshOx4YJmIgajoKjgM050rdBK59TdfVL+l+AAV5q/TlHocalYtvkEBdGmIDg 2td9UHSime6sp20vQfczUEz4bgrQsh4l2Fa/qU2jFwLievnBw0AvEaMximkSGMJe b8XfjmdTjlOesWAejANKtQolfrq14+1wYw0zZZ8PA+uNVpKdoovmcqSOcaDC9bT8 GO/NFUvoG+lkcvJcIlo1SSl81SmGLosijwxWfGvFAqsgpR3/3l3dYp0QtztoCNJO d3+HnjgYn5o5FwufuTD3eUOXH4AFjG108DH0o25XrIkb2Kymy0o= =BalU -----END PGP SIGNATURE----- Merge tag 'kvm-x86-generic-6.11' of https://github.com/kvm-x86/linux into HEAD KVM generic changes for 6.11 - Enable halt poll shrinking by default, as Intel found it to be a clear win. - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86. - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out(). - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs. - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout. - A few minor cleanups	2024-07-16 09:51:36 -04:00
Paolo Bonzini	1c5a0b55ab	KVM/arm64 changes for 6.11 - Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement - Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol - FPSIMD/SVE support for nested, including merged trap configuration and exception routing - New command-line parameter to control the WFx trap behavior under KVM - Introduce kCFI hardening in the EL2 hypervisor - Fixes + cleanups for handling presence/absence of FEAT_TCRX - Miscellaneous fixes + documentation updates -----BEGIN PGP SIGNATURE----- iI0EABYIADUWIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZpTCAxccb2xpdmVyLnVw dG9uQGxpbnV4LmRldgAKCRCivnWIJHzdFjChAQCWs9ucJag4USgvXpg5mo9sxzly kBZZ1o49N/VLxs4cagEAtq3KVNQNQyGXelYH6gr20aI85j6VnZW5W5z+sy5TAgk= =sSOt -----END PGP SIGNATURE----- Merge tag 'kvmarm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 changes for 6.11 - Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement - Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol - FPSIMD/SVE support for nested, including merged trap configuration and exception routing - New command-line parameter to control the WFx trap behavior under KVM - Introduce kCFI hardening in the EL2 hypervisor - Fixes + cleanups for handling presence/absence of FEAT_TCRX - Miscellaneous fixes + documentation updates	2024-07-16 09:50:44 -04:00
Oliver Upton	bc2e3253ca	Merge branch kvm-arm64/nv-tcr2 into kvmarm/next * kvm-arm64/nv-tcr2: : Fixes to the handling of TCR_EL1, courtesy of Marc Zyngier : : Series addresses a couple gaps that are present in KVM (from cover : letter): : : - VM configuration: HCRX_EL2.TCR2En is forced to 1, and we blindly : save/restore stuff. : : - trap bit description and routing: none, obviously, since we make a : point in not trapping. KVM: arm64: Honor trap routing for TCR2_EL1 KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRX KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM features KVM: arm64: Get rid of HCRX_GUEST_FLAGS KVM: arm64: Correctly honor the presence of FEAT_TCRX Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-07-14 00:28:37 +00:00
Oliver Upton	8c2899e770	Merge branch kvm-arm64/nv-sve into kvmarm/next * kvm-arm64/nv-sve: : CPTR_EL2, FPSIMD/SVE support for nested : : This series brings support for honoring the guest hypervisor's CPTR_EL2 : trap configuration when running a nested guest, along with support for : FPSIMD/SVE usage at L1 and L2. KVM: arm64: Allow the use of SVE+NV KVM: arm64: nv: Add additional trap setup for CPTR_EL2 KVM: arm64: nv: Add trap description for CPTR_EL2 KVM: arm64: nv: Add TCPAC/TTA to CPTR->CPACR conversion helper KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2 KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap KVM: arm64: nv: Handle CPACR_EL1 traps KVM: arm64: Spin off helper for programming CPTR traps KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context KVM: arm64: nv: Load guest hyp's ZCR into EL1 state KVM: arm64: nv: Handle ZCR_EL2 traps KVM: arm64: nv: Forward SVE traps to guest hypervisor KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-07-14 00:27:06 +00:00
Oliver Upton	1270dad310	Merge branch kvm-arm64/el2-kcfi into kvmarm/next * kvm-arm64/el2-kcfi: : kCFI support in the EL2 hypervisor, courtesy of Pierre-Clément Tosi : : Enable the usage fo CONFIG_CFI_CLANG (kCFI) for hardening indirect : branches in the EL2 hypervisor. Unlike kernel support for the feature, : CFI failures at EL2 are always fatal. KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2 KVM: arm64: Introduce print_nvhe_hyp_panic helper arm64: Introduce esr_brk_comment, esr_is_cfi_brk KVM: arm64: VHE: Mark __hyp_call_panic __noreturn KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32 KVM: arm64: nVHE: Simplify invalid_host_el2_vect KVM: arm64: Fix __pkvm_init_switch_pgd call ABI KVM: arm64: Fix clobbered ELR in sync abort/SError Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-07-14 00:23:32 +00:00
Oliver Upton	377d0e5d77	Merge branch kvm-arm64/ctr-el0 into kvmarm/next * kvm-arm64/ctr-el0: : Support for user changes to CTR_EL0, courtesy of Sebastian Ott : : Allow userspace to change the guest-visible value of CTR_EL0 for a VM, : so long as the requested value represents a subset of features supported : by hardware. In other words, prevent the VMM from over-promising the : capabilities of hardware. : : Make this happen by fitting CTR_EL0 into the existing infrastructure for : feature ID registers. KVM: selftests: Assert that MPIDR_EL1 is unchanged across vCPU reset KVM: arm64: nv: Unfudge ID_AA64PFR0_EL1 masking KVM: selftests: arm64: Test writes to CTR_EL0 KVM: arm64: rename functions for invariant sys regs KVM: arm64: show writable masks for feature registers KVM: arm64: Treat CTR_EL0 as a VM feature ID register KVM: arm64: unify code to prepare traps KVM: arm64: nv: Use accessors for modifying ID registers KVM: arm64: Add helper for writing ID regs KVM: arm64: Use read-only helper for reading VM ID registers KVM: arm64: Make idregs debugfs iterator search sysreg table directly KVM: arm64: Get sys_reg encoding from descriptor in idregs_debug_show() Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-07-14 00:22:32 +00:00
Oliver Upton	435a9f60ed	Merge branch kvm-arm64/shadow-mmu into kvmarm/next * kvm-arm64/shadow-mmu: : Shadow stage-2 MMU support for NV, courtesy of Marc Zyngier : : Initial implementation of shadow stage-2 page tables to support a guest : hypervisor. In the author's words: : : So here's the 10000m (approximately 30000ft for those of you stuck : with the wrong units) view of what this is doing: : : - for each {VMID,VTTBR,VTCR} tuple the guest uses, we use a : separate shadow s2_mmu context. This context has its own "real" : VMID and a set of page tables that are the combination of the : guest's S2 and the host S2, built dynamically one fault at a time. : : - these shadow S2 contexts are ephemeral, and behave exactly as : TLBs. For all intent and purposes, they are TLBs, and we discard : them pretty often. : : - TLB invalidation takes three possible paths: : : * either this is an EL2 S1 invalidation, and we directly emulate : it as early as possible : : * or this is an EL1 S1 invalidation, and we need to apply it to : the shadow S2s (plural!) that match the VMID set by the L1 guest : : * or finally, this is affecting S2, and we need to teardown the : corresponding part of the shadow S2s, which invalidates the TLBs KVM: arm64: nv: Truely enable nXS TLBI operations KVM: arm64: nv: Add handling of NXS-flavoured TLBI operations KVM: arm64: nv: Add handling of range-based TLBI operations KVM: arm64: nv: Add handling of outer-shareable TLBI operations KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information KVM: arm64: nv: Tag shadow S2 entries with guest's leaf S2 level KVM: arm64: nv: Handle FEAT_TTL hinted TLB operations KVM: arm64: nv: Handle TLBI IPAS2E1{,IS} operations KVM: arm64: nv: Handle TLBI ALLE1{,IS} operations KVM: arm64: nv: Handle TLBI VMALLS12E1{,IS} operations KVM: arm64: nv: Handle TLB invalidation targeting L2 stage-1 KVM: arm64: nv: Handle EL2 Stage-1 TLB invalidation KVM: arm64: nv: Add Stage-1 EL2 invalidation primitives KVM: arm64: nv: Unmap/flush shadow stage 2 page tables KVM: arm64: nv: Handle shadow stage 2 page faults KVM: arm64: nv: Implement nested Stage-2 page table walk logic KVM: arm64: nv: Support multiple nested Stage-2 mmu structures Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-07-14 00:11:45 +00:00
Oliver Upton	a35d5b2032	Merge branch kvm-arm64/ffa-1p1 into kvmarm/next * kvm-arm64/ffa-1p1: : Improvements to the pKVM FF-A Proxy, courtesy of Sebastian Ene : : Various minor improvements to how host FF-A calls are proxied with the : TEE, along with support for v1.1 of the protocol. KVM: arm64: Use FF-A 1.1 with pKVM KVM: arm64: Update the identification range for the FF-A smcs KVM: arm64: Add support for FFA_PARTITION_INFO_GET KVM: arm64: Trap FFA_VERSION host call in pKVM Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-07-14 00:11:34 +00:00
Oliver Upton	cb52b5c8b8	Revert "KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity" This reverts commit `eb9d53d4a9`. As Marc pointed out on the list [*], this patch is wrong, and those who find themselves in the SOB chain should have their heads checked. Annoyingly, the architecture has some FGT trap bits that are negative (i.e. 0 implies trap), and there was some confusion how KVM handles this for nested guests. However, it is clear now that KVM honors the RES0-ness of FGT traps already, meaning traps for features never exposed to the guest hypervisor get handled at L0. As they should. Link: https://lore.kernel.org/kvmarm/86bk3c3uss.wl-maz@kernel.org/T/#mb9abb3dd79f6a4544a91cb35676bd637c3a5e836 Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-07-08 19:50:51 +00:00
Marc Zyngier	3cfde36df7	KVM: arm64: nv: Truely enable nXS TLBI operations Although we now have support for nXS-flavoured TLBI instructions, we still don't expose the feature to the guest thanks to a mixture of misleading comment and use of a bunch of magic values. Fix the comment and correctly express the masking of LS64, which is enough to expose nXS to the world. Not that anyone cares... Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240703154743.824824-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-07-03 22:46:14 +00:00
Marc Zyngier	91e9cc70b7	KVM: arm64: Honor trap routing for TCR2_EL1 TCR2_EL1 handling is missing the handling of its trap configuration: - HCRX_EL2.TCR2En must be handled in conjunction with HCR_EL2.{TVM,TRVM} - HFG{R,W}TR_EL2.TCR_EL1 does apply to TCR2_EL1 as well Without these two controls being implemented, it is impossible to correctly route TCR2_EL1 traps. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240625130042.259175-7-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-27 00:04:25 +00:00
Marc Zyngier	663abf04ee	KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRX As per the architecture, if FEAT_S1PIE is implemented, then FEAT_TCRX must be implemented as well. Take advantage of this to avoid checking for S1PIE when TCRX isn't implemented. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240625130042.259175-6-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-27 00:04:25 +00:00
Marc Zyngier	1b04fd4027	KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM features As for other registers, save/restore of TCR2_EL1 should be gated on the feature being actually present. In the case of a nVHE hypervisor, it is perfectly fine to leave the host value in the register, as HCRX_EL2.TCREn==0 imposes that TCR2_EL1 is treated as 0. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240625130042.259175-4-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-27 00:04:25 +00:00
Marc Zyngier	a3ee9ce88b	KVM: arm64: Get rid of HCRX_GUEST_FLAGS HCRX_GUEST_FLAGS gives random KVM hackers the impression that they can stuff bits in this macro and unconditionally enable features in the guest. In general, this is wrong (we have been there with FEAT_MOPS, and again with FEAT_TCRX). Document that HCRX_EL2.SMPME is an exception rather than the rule, and get rid of HCRX_GUEST_FLAGS. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20240625130042.259175-3-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-27 00:04:25 +00:00
Marc Zyngier	9b58e665d6	KVM: arm64: Correctly honor the presence of FEAT_TCRX We currently blindly enable TCR2_EL1 use in a guest, irrespective of the feature set. This is obviously wrong, and we should actually honor the guest configuration and handle the possible trap resulting from the guest being buggy. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20240625130042.259175-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-27 00:04:25 +00:00
Oliver Upton	33d85a93c6	KVM: arm64: nv: Unfudge ID_AA64PFR0_EL1 masking Marc reports that L1 VMs aren't booting with the NV series applied to today's kvmarm/next. After bisecting the issue, it appears that `44241f34fa` ("KVM: arm64: nv: Use accessors for modifying ID registers") is to blame. Poking around at the issue a bit further, it'd appear that the value for ID_AA64PFR0_EL1 is complete garbage, as 'val' still contains the value we set ID_AA64ISAR1_EL1 to. Fix the read-modify-write pattern to actually use ID_AA64PFR0_EL1 as the starting point. Excuse me as I return to my shame cube. Reported-by: Marc Zyngier <maz@kernel.org> Fixes: `44241f34fa` ("KVM: arm64: nv: Use accessors for modifying ID registers") Acked-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240621224044.2465901-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-22 17:21:50 +00:00
Oliver Upton	f1ee914fb6	KVM: arm64: Allow the use of SVE+NV Allow SVE and NV to mix now that everything is in place to handle it correctly. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-16-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:04:49 +00:00
Marc Zyngier	cd931bd609	KVM: arm64: nv: Add additional trap setup for CPTR_EL2 We need to teach KVM a couple of new tricks. CPTR_EL2 and its VHE accessor CPACR_EL1 need to be handled specially: - CPACR_EL1 is trapped on VHE so that we can track the TCPAC and TTA bits - CPTR_EL2.{TCPAC,E0POE} are propagated from L1 to L2 Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-15-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:04:49 +00:00
Marc Zyngier	e19d533126	KVM: arm64: nv: Add trap description for CPTR_EL2 Add trap description for CPTR_EL2.{TCPAC,TAM,E0POE,TTA}. TTA is a bit annoying as it changes location depending on E2H. This forces us to add yet another "complex" trap condition. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-14-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:04:49 +00:00
Oliver Upton	5326303bb7	KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2 Start folding the guest hypervisor's FP/SVE traps into the value programmed in hardware. Note that as of writing this is dead code, since KVM does a full put() / load() for every nested exception boundary which saves + flushes the FP/SVE state. However, this will become useful when we can keep the guest's FP/SVE state alive across a nested exception boundary and the host no longer needs to conservatively program traps. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-12-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:04:49 +00:00
Oliver Upton	0cfc85b8f5	KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap Round out the ZCR_EL2 gymnastics by loading SVE state in the fast path when the guest hypervisor tries to access SVE state. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-11-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:04:49 +00:00
Marc Zyngier	493da2b1c4	KVM: arm64: nv: Handle CPACR_EL1 traps Handle CPACR_EL1 accesses when running a VHE guest. In order to limit the cost of the emulation, implement it ass a shallow exit. In the other cases: - this is a nVHE L1 which will write to memory, and we don't trap - this is a L2 guest: * the L1 has CPTR_EL2.TCPAC==0, and the L2 has direct register access * the L1 has CPTR_EL2.TCPAC==1, and the L2 will trap, but the handling is defered to the general handling for forwarding Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-10-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:04:49 +00:00
Oliver Upton	1785f020b1	KVM: arm64: Spin off helper for programming CPTR traps A subsequent change to KVM will add preliminary support for merging a guest hypervisor's CPTR traps with that of KVM. Prepare by spinning off a new helper for managing CPTR traps. Avoid reading CPACR_EL1 for the baseline trap config, and start off with the most restrictive set of traps that is subsequently relaxed. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-9-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:04:10 +00:00
Oliver Upton	2e3cf82063	KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state It is possible that the guest hypervisor has selected a smaller VL than the maximum for its nested guest. As such, ZCR_EL2 may be configured for a different VL when exiting a nested guest. Set ZCR_EL2 (via the EL1 alias) to the maximum VL for the VM before saving SVE state as the SVE save area is dimensioned by the max VL. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-8-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:02:40 +00:00
Oliver Upton	9092aca9fe	KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest The max VL for nested guests is additionally constrained by the max VL selected by the guest hypervisor. Use that instead of KVM's max VL when running a nested guest. Note that the guest hypervisor's ZCR_EL2 is sanitised against the VM's max VL at the time of access, so there's no additional handling required at the time of use. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-7-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:02:40 +00:00
Oliver Upton	b7e5c94264	KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context When running a guest hypervisor, ZCR_EL2 is an alias for the counterpart EL1 state. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-6-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:02:40 +00:00
Oliver Upton	069da3ffda	KVM: arm64: nv: Load guest hyp's ZCR into EL1 state Load the guest hypervisor's ZCR_EL2 into the corresponding EL1 register when restoring SVE state, as ZCR_EL2 affects the VL in the hypervisor context. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-5-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:02:40 +00:00
Oliver Upton	b3d29a8230	KVM: arm64: nv: Handle ZCR_EL2 traps Unlike other SVE-related registers, ZCR_EL2 takes a sysreg trap to EL2 when HCR_EL2.NV = 1. KVM still needs to honor the guest hypervisor's trap configuration, which expects an SVE trap (i.e. ESR_EL2.EC = 0x19) when CPTR traps are enabled for the vCPU's current context. Otherwise, if the guest hypervisor has traps disabled, emulate the access by mapping the requested VL into ZCR_EL1. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-4-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:01:20 +00:00
Oliver Upton	399debfc97	KVM: arm64: nv: Forward SVE traps to guest hypervisor Similar to FPSIMD traps, don't load SVE state if the guest hypervisor has SVE traps enabled and forward the trap instead. Note that ZCR_EL2 will require some special handling, as it takes a sysreg trap to EL2 when HCR_EL2.NV = 1. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:01:20 +00:00
Jintack Lim	d2b2ecba8d	KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor Give precedence to the guest hypervisor's trap configuration when routing an FP/ASIMD trap taken to EL2. Take advantage of the infrastructure for translating CPTR_EL2 into the VHE (i.e. EL1) format and base the trap decision solely on the VHE view of the register. The in-memory value of CPTR_EL2 will always be up to date for the guest hypervisor (more on that later), so just read it directly from memory. Bury all of this behind a macro keyed off of the CPTR bitfield in anticipation of supporting other traps (e.g. SVE). [maz: account for HCR_EL2.E2H when testing for TFP/FPEN, with all the hard work actually being done by Chase Conklin] [ oliver: translate nVHE->VHE format for testing traps; macro for reuse in other CPTR_EL2.xEN fields ] Signed-off-by: Jintack Lim <jintack.lim@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-2-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 19:01:20 +00:00
Pierre-Clément Tosi	eca4ba5b6d	KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2 The compiler implements kCFI by adding type information (u32) above every function that might be indirectly called and, whenever a function pointer is called, injects a read-and-compare of that u32 against the value corresponding to the expected type. In case of a mismatch, a BRK instruction gets executed. When the hypervisor triggers such an exception in nVHE, it panics and triggers and exception return to EL1. Therefore, teach nvhe_hyp_panic_handler() to detect kCFI errors from the ESR and report them. If necessary, remind the user that EL2 kCFI is not affected by CONFIG_CFI_PERMISSIVE. Pass $(CC_FLAGS_CFI) to the compiler when building the nVHE hyp code. Use SYM_TYPED_FUNC_START() for __pkvm_init_switch_pgd, as nVHE can't call it directly and must use a PA function pointer from C (because it is part of the idmap page), which would trigger a kCFI failure if the type ID wasn't present. Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-9-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:40:54 +00:00
Pierre-Clément Tosi	8f3873a395	KVM: arm64: Introduce print_nvhe_hyp_panic helper Add a helper to display a panic banner soon to also be used for kCFI failures, to ensure that we remain consistent. Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-8-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:40:54 +00:00
Pierre-Clément Tosi	7a928b32f1	arm64: Introduce esr_brk_comment, esr_is_cfi_brk As it is already used in two places, move esr_comment() to a header for re-use, with a clearer name. Introduce esr_is_cfi_brk() to detect kCFI BRK syndromes, currently used by early_brk64() but soon to also be used by hypervisor code. Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-7-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:40:54 +00:00
Pierre-Clément Tosi	3c6eb64876	KVM: arm64: VHE: Mark __hyp_call_panic __noreturn Given that the sole purpose of __hyp_call_panic() is to call panic(), a __noreturn function, give it the __noreturn attribute, removing the need for its caller to use unreachable(). Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-6-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:40:54 +00:00
Pierre-Clément Tosi	4ab3f9dd56	KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32 Ignore R_AARCH64_ABS32 relocations, instead of panicking, when emitting the relocation table of the hypervisor. The toolchain might produce them when generating function calls with kCFI to represent the 32-bit type ID which can then be resolved across compilation units at link time. These are NOT actual 32-bit addresses and are therefore not needed in the final (runtime) relocation table (which is unlikely to use 32-bit absolute addresses for arm64 anyway). Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-5-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:40:54 +00:00
Pierre-Clément Tosi	6e3b773ed6	KVM: arm64: nVHE: Simplify invalid_host_el2_vect The invalid_host_el2_vect macro is used by EL2{t,h} handlers in nVHE host context, which should never run with a guest context loaded. Therefore, remove the superfluous vCPU context check and branch unconditionally to hyp_panic. Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-4-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:40:54 +00:00
Pierre-Clément Tosi	ea9d7c83d1	KVM: arm64: Fix __pkvm_init_switch_pgd call ABI Fix the mismatch between the (incorrect) C signature, C call site, and asm implementation by aligning all three on an API passing the parameters (pgd and SP) separately, instead of as a bundled struct. Remove the now unnecessary memory accesses while the MMU is off from the asm, which simplifies the C caller (as it does not need to convert a VA struct pointer to PA) and makes the code slightly more robust by offsetting the struct fields from C and properly expressing the call to the C compiler (e.g. type checker and kCFI). Fixes: `f320bc742b` ("KVM: arm64: Prepare the creation of s1 mappings at EL2") Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-3-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:40:53 +00:00
Pierre-Clément Tosi	a8f0655887	KVM: arm64: Fix clobbered ELR in sync abort/SError When the hypervisor receives a SError or synchronous exception (EL2h) while running with the __kvm_hyp_vector and if ELR_EL2 doesn't point to an extable entry, it panics indirectly by overwriting ELR with the address of a panic handler in order for the asm routine it returns to to ERET into the handler. However, this clobbers ELR_EL2 for the handler itself. As a result, hyp_panic(), when retrieving what it believes to be the PC where the exception happened, actually ends up reading the address of the panic handler that called it! This results in an erroneous and confusing panic message where the source of any synchronous exception (e.g. BUG() or kCFI) appears to be __guest_exit_panic, making it hard to locate the actual BRK instruction. Therefore, store the original ELR_EL2 in the per-CPU kvm_hyp_ctxt and point the sysreg to a routine that first restores it to its previous value before running __guest_exit_panic. Fixes: `7db2153047` ("KVM: arm64: Restore hyp when panicking in guest context") Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-2-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:40:53 +00:00
Sebastian Ott	76d3601227	KVM: arm64: rename functions for invariant sys regs Invariant system id registers are populated with host values at initialization time using their .reset function cb. These are currently called get_* which is usually used by the functions implementing the .get_user callback. Change their function names to reset_* to reflect what they are used for. Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240619174036.483943-10-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:16:45 +00:00
Sebastian Ott	bb4fa769dc	KVM: arm64: show writable masks for feature registers Instead of using ~0UL provide the actual writable mask for non-id feature registers in the output of the KVM_ARM_GET_REG_WRITABLE_MASKS ioctl. This changes the mask for the CTR_EL0 and CLIDR_EL1 registers. Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240619174036.483943-9-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:16:45 +00:00
Sebastian Ott	2843cae266	KVM: arm64: Treat CTR_EL0 as a VM feature ID register CTR_EL0 is currently handled as an invariant register, thus guests will be presented with the host value of that register. Add emulation for CTR_EL0 based on a per VM value. Userspace can switch off DIC and IDC bits and reduce DminLine and IminLine sizes. Naturally, ensure CTR_EL0 is trapped (HCR_EL2.TID2=1) any time that a VM's CTR_EL0 differs from hardware. Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Link: https://lore.kernel.org/r/20240619174036.483943-8-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:16:44 +00:00
Sebastian Ott	f1ff3fc520	KVM: arm64: unify code to prepare traps There are 2 functions to calculate traps via HCR_EL2: * kvm_init_sysreg() called via KVM_RUN (before the 1st run or when the pid changes) * vcpu_reset_hcr() called via KVM_ARM_VCPU_INIT To unify these 2 and to support traps that are dependent on the ID register configuration, move the code from vcpu_reset_hcr() to sys_regs.c and call it via kvm_init_sysreg(). We still have to keep the non-FWB handling stuff in vcpu_reset_hcr(). Also the initialization with HCR_GUEST_FLAGS is kept there but guarded by !vcpu_has_run_once() to ensure that previous calculated values don't get overwritten. While at it rename kvm_init_sysreg() to kvm_calculate_traps() to better reflect what it's doing. Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240619174036.483943-7-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:16:44 +00:00
Oliver Upton	44241f34fa	KVM: arm64: nv: Use accessors for modifying ID registers In the interest of abstracting away the underlying storage of feature ID registers, rework the nested code to go through the accessors instead of directly iterating the id_regs array. This means we now lose the property that ID registers unknown to the nested code get zeroed, but we really ought to be handling those explicitly going forward. Link: https://lore.kernel.org/r/20240619174036.483943-6-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:16:44 +00:00
Oliver Upton	d7508d27dd	KVM: arm64: Add helper for writing ID regs Replace the remaining usage of IDREG() with a new helper for setting the value of a feature ID register, with the benefit of cramming in some extra sanity checks. Reviewed-by: Sebastian Ott <sebott@redhat.com> Link: https://lore.kernel.org/r/20240619174036.483943-5-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:16:44 +00:00
Oliver Upton	97ca3fcc15	KVM: arm64: Use read-only helper for reading VM ID registers IDREG() expands to the storage of a particular ID reg, which can be useful for handling both reads and writes. However, outside of a select few situations, the ID registers should be considered read only. Replace current readers with a new macro that expands to the value of the field rather than the field itself. Reviewed-by: Sebastian Ott <sebott@redhat.com> Link: https://lore.kernel.org/r/20240619174036.483943-4-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:16:44 +00:00
Oliver Upton	410db103f6	KVM: arm64: Make idregs debugfs iterator search sysreg table directly CTR_EL0 complicates the existing scheme for iterating feature ID registers, as it is not in the contiguous range that we presently support. Just search the sysreg table for the Nth feature ID register in anticipation of this. Yes, the debugfs interface has quadratic time completixy now. Boo hoo. Reviewed-by: Sebastian Ott <sebott@redhat.com> Link: https://lore.kernel.org/r/20240619174036.483943-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:16:44 +00:00
Oliver Upton	4e8ff73eb7	KVM: arm64: Get sys_reg encoding from descriptor in idregs_debug_show() KVM is about to add support for more VM-scoped feature ID regs that live outside of the id_regs[] array, which means the index of the debugfs iterator may not actually be an index into the array. Prepare by getting the sys_reg encoding from the descriptor itself. Reviewed-by: Sebastian Ott <sebott@redhat.com> Link: https://lore.kernel.org/r/20240619174036.483943-2-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-20 17:16:44 +00:00
Oliver Upton	3dc14eefa5	KVM: arm64: nv: Use GFP_KERNEL_ACCOUNT for sysreg_masks allocation Of course, userspace is in the driver's seat for struct kvm and associated allocations. Make sure the sysreg_masks allocation participates in kmem accounting. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240617181018.2054332-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-19 08:42:39 +00:00
Marc Zyngier	0feec7769a	KVM: arm64: nv: Add handling of NXS-flavoured TLBI operations Latest kid on the block: NXS (Non-eXtra-Slow) TLBI operations. Let's add those in bulk (NSH, ISH, OSH, both normal and range) as they directly map to their XS (the standard ones) counterparts. Not a lot to say about them, they are basically useless. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240614144552.2773592-17-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-19 08:14:38 +00:00
Marc Zyngier	5d476ca57d	KVM: arm64: nv: Add handling of range-based TLBI operations We already support some form of range operation by handling FEAT_TTL, but so far the "arbitrary" range operations are unsupported. Let's fix that. For EL2 S1, this is simple enough: we just map both NSH, ISH and OSH instructions onto the ISH version for EL1. For TLBI instructions affecting EL1 S1, we use the same model as their non-range counterpart to invalidate in the context of the correct VMID. For TLBI instructions affecting S2, we interpret the data passed by the guest to compute the range and use that to tear-down part of the shadow S2 range and invalidate the TLBs. Finally, we advertise FEAT_TLBIRANGE if the host supports it. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240614144552.2773592-16-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-06-19 08:14:38 +00:00

1 2 3 4 5 ...

2530 commits