// SPDX-License-Identifier: GPL-2.0-only
/*
 * Copyright (C) 2017 ARM Ltd.
 * Author: Marc Zyngier <marc.zyngier@arm.com>
 */

#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/irqdomain.h>
#include <linux/kvm_host.h>
#include <linux/irqchip/arm-gic-v3.h>

#include "vgic.h"

/*
 * How KVM uses GICv4 (insert rude comments here):
 *
 * The vgic-v4 layer acts as a bridge between several entities:
 * - The GICv4 ITS representation offered by the ITS driver
 * - VFIO, which is in charge of the PCI endpoint
 * - The virtual ITS, which is the only thing the guest sees
 *
 * The configuration of VLPIs is triggered by a callback from VFIO,
 * instructing KVM that a PCI device has been configured to deliver
 * MSIs to a vITS.
 *
 * kvm_vgic_v4_set_forwarding() is thus called with the routing entry,
 * and this is used to find the corresponding vITS data structures
 * (ITS instance, device, event and irq) using a process that is
 * extremely similar to the injection of an MSI.
 *
 * At this stage, we can link the guest's view of an LPI (uniquely
 * identified by the routing entry) and the host irq, using the GICv4
 * driver mapping operation. Should the mapping succeed, we've then
 * successfully upgraded the guest's LPI to a VLPI. We can then start
 * updating GICv4's view of the property table and generating an
 * INValidation in order to kickstart the delivery of this VLPI to the
 * guest directly, without software intervention. Well, almost.
 *
 * When the PCI endpoint is deconfigured, this operation is reversed
 * with VFIO calling kvm_vgic_v4_unset_forwarding().
 *
 * Once the VLPI has been mapped, it needs to follow any change the
 * guest performs on its LPI through the vITS. For that, a number of
 * command handlers have hooks to communicate these changes to the HW:
 * - Any invalidation triggers a call to its_prop_update_vlpi()
 * - The INT command results in an irq_set_irqchip_state(), which
 *   generates an INT on the corresponding VLPI.
 * - The CLEAR command results in an irq_set_irqchip_state(), which
 *   generates a CLEAR on the corresponding VLPI.
 * - DISCARD translates into an unmap, similar to a call to
 *   kvm_vgic_v4_unset_forwarding().
 * - MOVI is translated by an update of the existing mapping, changing
 *   the target vcpu, resulting in a VMOVI being generated.
 * - MOVALL is translated by a string of mapping updates (similar to
 *   the handling of MOVI). MOVALL is horrible.
 *
 * Note that a DISCARD/MAPTI sequence emitted from the guest without
 * reprogramming the PCI endpoint after MAPTI does not result in a
 * VLPI being mapped, as there is no callback from VFIO (the guest
 * will get the interrupt via the normal SW injection). Fixing this is
 * not trivial, and requires some horrible messing with the VFIO
 * internals. Not fun. Don't do that.
 *
 * Then there is the scheduling. Each time a vcpu is about to run on a
 * physical CPU, KVM must tell the corresponding redistributor about
 * it. And if we've migrated our vcpu from one CPU to another, we must
 * tell the ITS (so that the messages reach the right redistributor).
 * This is done in two steps: first issue an irq_set_affinity() on the
 * irq corresponding to the vcpu, then call its_make_vpe_resident().
 * You must be in a non-preemptible context. On exit, a call to
 * its_make_vpe_non_resident() tells the redistributor that we're done
 * with the vcpu.
 *
 * Finally, the doorbell handling: Each vcpu is allocated an interrupt
 * which will fire each time a VLPI is made pending whilst the vcpu is
 * not running. Each time the vcpu gets blocked, the doorbell
 * interrupt gets enabled. When the vcpu is unblocked (for whatever
 * reason), the doorbell interrupt is disabled.
 */

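/*
 * Rough sketch of the residency flow as driven by the rest of KVM
 * (not a literal call chain, see the caveats above):
 *
 *   vcpu_load            -> vgic_v4_load():  irq_set_affinity() on the
 *                           doorbell irq, then its_make_vpe_resident()
 *   vcpu_put             -> vgic_v4_put():   its_make_vpe_non_resident()
 *   blocking on WFI      -> "put", requesting the doorbell
 *   unblocking from WFI  -> "load", doorbell no longer needed
 */
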
#define DB_IRQ_FLAGS	(IRQ_NOAUTOEN | IRQ_DISABLE_UNLAZY | IRQ_NO_BALANCING)

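/*
 * Doorbell interrupt handler: a VLPI was made pending while the vPE
 * was not resident. Latch that fact and kick the vcpu so that it
 * notices the pending interrupt.
 */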
static irqreturn_t vgic_v4_doorbell_handler(int irq, void *info)
{
	struct kvm_vcpu *vcpu = info;

	/* We got the message, no need to fire again */
	if (!kvm_vgic_global_state.has_gicv4_1 &&
	    !irqd_irq_disabled(&irq_to_desc(irq)->irq_data))
		disable_irq_nosync(irq);

	/*
	 * The v4.1 doorbell can fire concurrently with the vPE being
	 * made non-resident. Ensure we only update pending_last
	 * *after* the non-residency sequence has completed.
	 */
	raw_spin_lock(&vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vpe_lock);
	vcpu->arch.vgic_cpu.vgic_v3.its_vpe.pending_last = true;
	raw_spin_unlock(&vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vpe_lock);

	kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu);
	kvm_vcpu_kick(vcpu);

	return IRQ_HANDLED;
}

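/* Mirror the emulated SGI configuration into the vPE's HW view */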
static void vgic_v4_sync_sgi_config(struct its_vpe *vpe, struct vgic_irq *irq)
{
	vpe->sgi_config[irq->intid].enabled = irq->enabled;
	vpe->sgi_config[irq->intid].group = irq->group;
	vpe->sgi_config[irq->intid].priority = irq->priority;
}

static void vgic_v4_enable_vsgis(struct kvm_vcpu *vcpu)
{
	struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
	int i;

	/*
	 * With GICv4.1, every virtual SGI can be directly injected. So
	 * let's pretend that they are HW interrupts, tied to a host
	 * IRQ. The SGI code will do its magic.
	 */
	for (i = 0; i < VGIC_NR_SGIS; i++) {
		struct vgic_irq *irq = vgic_get_vcpu_irq(vcpu, i);
		struct irq_desc *desc;
		unsigned long flags;
		int ret;

		raw_spin_lock_irqsave(&irq->irq_lock, flags);

		if (irq->hw)
			goto unlock;

		irq->hw = true;
		irq->host_irq = irq_find_mapping(vpe->sgi_domain, i);

		/* Transfer the full irq state to the vPE */
		vgic_v4_sync_sgi_config(vpe, irq);
		desc = irq_to_desc(irq->host_irq);
		ret = irq_domain_activate_irq(irq_desc_get_irq_data(desc),
					      false);
		if (!WARN_ON(ret)) {
			/* Transfer pending state */
			ret = irq_set_irqchip_state(irq->host_irq,
						    IRQCHIP_STATE_PENDING,
						    irq->pending_latch);
			WARN_ON(ret);
			irq->pending_latch = false;
		}

	unlock:
		raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
		vgic_put_irq(vcpu->kvm, irq);
	}
}

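/* Revert vSGIs to SW emulation, capturing any HW pending state */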
static void vgic_v4_disable_vsgis(struct kvm_vcpu *vcpu)
{
	int i;

	for (i = 0; i < VGIC_NR_SGIS; i++) {
		struct vgic_irq *irq = vgic_get_vcpu_irq(vcpu, i);
		struct irq_desc *desc;
		unsigned long flags;
		int ret;

		raw_spin_lock_irqsave(&irq->irq_lock, flags);

		if (!irq->hw)
			goto unlock;

		irq->hw = false;
		ret = irq_get_irqchip_state(irq->host_irq,
					    IRQCHIP_STATE_PENDING,
					    &irq->pending_latch);
		WARN_ON(ret);

		desc = irq_to_desc(irq->host_irq);
		irq_domain_deactivate_irq(irq_desc_get_irq_data(desc));
	unlock:
		raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
		vgic_put_irq(vcpu->kvm, irq);
	}
}

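/*
 * Switch every vcpu's SGIs between HW-backed (vSGI) and SW-emulated
 * delivery, based on dist->nassgireq (the guest's nASSGIreq setting).
 * The guest is halted so that enabled/pending state is transferred
 * from a stable snapshot.
 */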
void vgic_v4_configure_vsgis(struct kvm *kvm)
{
	struct vgic_dist *dist = &kvm->arch.vgic;
	struct kvm_vcpu *vcpu;
	unsigned long i;

	lockdep_assert_held(&kvm->arch.config_lock);

	kvm_arm_halt_guest(kvm);

	kvm_for_each_vcpu(i, vcpu, kvm) {
		if (dist->nassgireq)
			vgic_v4_enable_vsgis(vcpu);
		else
			vgic_v4_disable_vsgis(vcpu);
	}

	kvm_arm_resume_guest(kvm);
}

/*
 * Must be called with GICv4.1 and with the vPE unmapped, which
 * guarantees that any VPT caches associated with the vPE have
 * been invalidated, so we can get the VLPI state by peeking
 * at the VPT.
 */
void vgic_v4_get_vlpi_state(struct vgic_irq *irq, bool *val)
{
	struct its_vpe *vpe = &irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
	int mask = BIT(irq->intid % BITS_PER_BYTE);
	void *va;
	u8 *ptr;

	va = page_address(vpe->vpt_page);
	ptr = va + irq->intid / BITS_PER_BYTE;

	*val = !!(*ptr & mask);
}

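/* Request the per-vPE doorbell interrupt, wired to vgic_v4_doorbell_handler() */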
int vgic_v4_request_vpe_irq(struct kvm_vcpu *vcpu, int irq)
{
	return request_irq(irq, vgic_v4_doorbell_handler, 0, "vcpu", vcpu);
}

/**
 * vgic_v4_init - Initialize the GICv4 data structures
 * @kvm: Pointer to the VM being initialized
 *
 * We may be called each time a vITS is created, or when the
 * vgic is initialized. In both cases, the number of vcpus
 * should now be fixed.
 */
int vgic_v4_init(struct kvm *kvm)
{
	struct vgic_dist *dist = &kvm->arch.vgic;
	struct kvm_vcpu *vcpu;
	int nr_vcpus, ret;
	unsigned long i;

	lockdep_assert_held(&kvm->arch.config_lock);

	if (!kvm_vgic_global_state.has_gicv4)
		return 0; /* Nothing to see here... move along. */

	if (dist->its_vm.vpes)
		return 0;

	nr_vcpus = atomic_read(&kvm->online_vcpus);

	dist->its_vm.vpes = kcalloc(nr_vcpus, sizeof(*dist->its_vm.vpes),
				    GFP_KERNEL_ACCOUNT);
	if (!dist->its_vm.vpes)
		return -ENOMEM;

	dist->its_vm.nr_vpes = nr_vcpus;

	kvm_for_each_vcpu(i, vcpu, kvm)
		dist->its_vm.vpes[i] = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;

	ret = its_alloc_vcpu_irqs(&dist->its_vm);
	if (ret < 0) {
		kvm_err("VPE IRQ allocation failure\n");
		kfree(dist->its_vm.vpes);
		dist->its_vm.nr_vpes = 0;
		dist->its_vm.vpes = NULL;
		return ret;
	}

	kvm_for_each_vcpu(i, vcpu, kvm) {
		int irq = dist->its_vm.vpes[i]->irq;
		unsigned long irq_flags = DB_IRQ_FLAGS;

		/*
		 * Don't automatically enable the doorbell, as we're
		 * flipping it back and forth when the vcpu gets
		 * blocked. Also disable the lazy disabling, as the
		 * doorbell could kick us out of the guest too
		 * early...
		 *
		 * On GICv4.1, the doorbell is managed in HW and must
		 * be left enabled.
		 */
		if (kvm_vgic_global_state.has_gicv4_1)
			irq_flags &= ~IRQ_NOAUTOEN;
		irq_set_status_flags(irq, irq_flags);

		ret = vgic_v4_request_vpe_irq(vcpu, irq);
		if (ret) {
			kvm_err("failed to allocate vcpu IRQ%d\n", irq);
			/*
			 * Trick: adjust the number of vpes so we know
			 * how many to nuke on teardown...
			 */
			dist->its_vm.nr_vpes = i;
			break;
		}
	}

	if (ret)
		vgic_v4_teardown(kvm);

	return ret;
}

/**
 * vgic_v4_teardown - Free the GICv4 data structures
 * @kvm: Pointer to the VM being destroyed
 */
void vgic_v4_teardown(struct kvm *kvm)
{
	struct its_vm *its_vm = &kvm->arch.vgic.its_vm;
	int i;

	lockdep_assert_held(&kvm->arch.config_lock);

	if (!its_vm->vpes)
		return;

	for (i = 0; i < its_vm->nr_vpes; i++) {
		struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, i);
		int irq = its_vm->vpes[i]->irq;

		irq_clear_status_flags(irq, DB_IRQ_FLAGS);
		free_irq(irq, vcpu);
	}

	its_free_vcpu_irqs(its_vm);
	kfree(its_vm->vpes);
	its_vm->nr_vpes = 0;
	its_vm->vpes = NULL;
}

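/* Decide whether going non-resident should also request the doorbell */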
static inline bool vgic_v4_want_doorbell(struct kvm_vcpu *vcpu)
{
	if (vcpu_get_flag(vcpu, IN_WFI))
		return true;

	if (likely(!vcpu_has_nv(vcpu)))
		return false;

	/*
	 * GICv4 hardware is only ever used for the L1. Mark the vPE (i.e. the
	 * L1 context) nonresident and request a doorbell to kick us out of the
	 * L2 when an IRQ becomes pending.
	 */
	return vcpu_get_flag(vcpu, IN_NESTED_ERET);
}

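/* Make the vPE non-resident on vcpu_put (or when blocking on WFI) */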
int vgic_v4_put(struct kvm_vcpu *vcpu)
{
	struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;

	if (!vgic_supports_direct_irqs(vcpu->kvm) || !vpe->resident)
		return 0;

	return its_make_vpe_non_resident(vpe, vgic_v4_want_doorbell(vcpu));
}

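/*
 * Make the vPE resident on vcpu_load: point the doorbell irq at the
 * current CPU (which turns into a VMOVP at the ITS level), then tell
 * the redistributor to schedule the vPE.
 */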
int vgic_v4_load(struct kvm_vcpu *vcpu)
{
	struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
	int err;

	if (!vgic_supports_direct_irqs(vcpu->kvm) || vpe->resident)
		return 0;

	if (vcpu_get_flag(vcpu, IN_WFI))
		return 0;

	/*
	 * Before making the VPE resident, make sure the redistributor
	 * corresponding to our current CPU expects us here. See the
	 * doc in drivers/irqchip/irq-gic-v4.c to understand how this
	 * turns into a VMOVP command at the ITS level.
	 */
	err = irq_set_affinity(vpe->irq, cpumask_of(smp_processor_id()));
	if (err)
		return err;

	err = its_make_vpe_resident(vpe, false, vcpu->kvm->arch.vgic.enabled);
	if (err)
		return err;

	/*
	 * Now that the VPE is resident, let's get rid of a potential
	 * doorbell interrupt that would still be pending. This is a
	 * GICv4.0 only "feature"...
	 */
	if (!kvm_vgic_global_state.has_gicv4_1)
		err = irq_set_irqchip_state(vpe->irq, IRQCHIP_STATE_PENDING, false);

	return err;
}

void vgic_v4_commit(struct kvm_vcpu *vcpu)
{
	struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;

	/*
	 * No need to wait for the vPE to be ready across a shallow guest
	 * exit, as only a vcpu_put will invalidate it.
	 */
	if (!vpe->ready)
		its_commit_vpe(vpe);
}

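/* Resolve the vITS targeted by the doorbell address in the routing entry */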
static struct vgic_its *vgic_get_its(struct kvm *kvm,
				     struct kvm_kernel_irq_routing_entry *irq_entry)
{
	struct kvm_msi msi = (struct kvm_msi) {
		.address_lo	= irq_entry->msi.address_lo,
		.address_hi	= irq_entry->msi.address_hi,
		.data		= irq_entry->msi.data,
		.flags		= irq_entry->msi.flags,
		.devid		= irq_entry->msi.devid,
	};

	return vgic_msi_to_its(kvm, &msi);
}

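/*
 * Called (via the VFIO/irq bypass path described at the top of this
 * file) to upgrade the guest LPI described by @irq_entry to a VLPI
 * delivered directly through host interrupt @virq, transferring any
 * pending state to the HW in the process.
 */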
int kvm_vgic_v4_set_forwarding(struct kvm *kvm, int virq,
			       struct kvm_kernel_irq_routing_entry *irq_entry)
{
	struct vgic_its *its;
	struct vgic_irq *irq;
	struct its_vlpi_map map;
	unsigned long flags;
	int ret = 0;

	if (!vgic_supports_direct_msis(kvm))
		return 0;

	/*
	 * Get the ITS, and escape early on error (not a valid
	 * doorbell for any of our vITSs).
	 */
	its = vgic_get_its(kvm, irq_entry);
	if (IS_ERR(its))
		return 0;

	guard(mutex)(&its->its_lock);

	/*
	 * Perform the actual DevID/EventID -> LPI translation.
	 *
	 * Silently exit if translation fails as the guest (or userspace!) has
	 * managed to do something stupid. Emulated LPI injection will still
	 * work if the guest figures itself out at a later time.
	 */
	if (vgic_its_resolve_lpi(kvm, its, irq_entry->msi.devid,
				 irq_entry->msi.data, &irq))
		return 0;

	raw_spin_lock_irqsave(&irq->irq_lock, flags);

	/* Silently exit if the vLPI is already mapped */
	if (irq->hw)
		goto out_unlock_irq;

	/*
	 * Emit the mapping request. If it fails, the ITS probably
	 * isn't v4 compatible, so let's silently bail out. Holding
	 * the ITS lock should ensure that nothing can modify the
	 * target vcpu.
	 */
	map = (struct its_vlpi_map) {
		.vm		= &kvm->arch.vgic.its_vm,
		.vpe		= &irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe,
		.vintid		= irq->intid,
		.properties	= ((irq->priority & 0xfc) |
				   (irq->enabled ? LPI_PROP_ENABLED : 0) |
				   LPI_PROP_GROUP1),
		.db_enabled	= true,
	};

	ret = its_map_vlpi(virq, &map);
	if (ret)
		goto out_unlock_irq;

	irq->hw		= true;
	irq->host_irq	= virq;
	atomic_inc(&map.vpe->vlpi_count);

	/* Transfer pending state */
	if (!irq->pending_latch)
		goto out_unlock_irq;

	ret = irq_set_irqchip_state(irq->host_irq, IRQCHIP_STATE_PENDING,
				    irq->pending_latch);
	WARN_RATELIMIT(ret, "IRQ %d", irq->host_irq);

	/*
	 * Clear pending_latch and communicate this state
	 * change via vgic_queue_irq_unlock.
	 */
	irq->pending_latch = false;
	vgic_queue_irq_unlock(kvm, irq, flags);
	return ret;

out_unlock_irq:
	raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
	return ret;
}

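/*
 * Find the vLPI currently forwarded to @host_irq by walking the LPI
 * xarray, and take a reference on it (dropped by the caller).
 */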
static struct vgic_irq *__vgic_host_irq_get_vlpi(struct kvm *kvm, int host_irq)
{
	struct vgic_irq *irq;
	unsigned long idx;

	guard(rcu)();
	xa_for_each(&kvm->arch.vgic.lpi_xa, idx, irq) {
		if (!irq->hw || irq->host_irq != host_irq)
			continue;

		if (!vgic_try_get_irq_kref(irq))
			return NULL;

		return irq;
	}

	return NULL;
}

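/*
 * Called when VFIO tears the forwarding down: demote the VLPI back to
 * a purely emulated LPI by unmapping it at the ITS level.
 */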
void kvm_vgic_v4_unset_forwarding(struct kvm *kvm, int host_irq)
{
	struct vgic_irq *irq;
	unsigned long flags;

	if (!vgic_supports_direct_msis(kvm))
		return;

	irq = __vgic_host_irq_get_vlpi(kvm, host_irq);
	if (!irq)
		return;

	raw_spin_lock_irqsave(&irq->irq_lock, flags);
	WARN_ON(irq->hw && irq->host_irq != host_irq);
	if (irq->hw) {
		atomic_dec(&irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vlpi_count);
		irq->hw = false;
		its_unmap_vlpi(host_irq);
	}

	raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
	vgic_put_irq(kvm, irq);
}