Re: [PATCH] KVM: arm64: vgic-its: Drop the translation cache reference only for the erased entry

From: Hyunwoo Kim

Date: Mon Jun 01 2026 - 21:32:24 EST


On Mon, Jun 01, 2026 at 12:08:55PM -0700, Oliver Upton wrote:
> Hi Hyunwoo,
>
> Nice find.
>
> On Mon, Jun 01, 2026 at 11:53:26PM +0900, Hyunwoo Kim wrote:
> > vgic_its_invalidate_cache() walks the per-ITS translation cache with
> > xa_for_each() and drops the cache's reference on each entry with
> > vgic_put_irq(). It puts the iterated pointer, though, rather than the
> > value returned by xa_erase().
> >
> > The function is called from contexts that do not exclude one another: the
> > ITS command handlers hold its_lock, the GITS_CTLR write path holds
> > cmd_lock, and the path that clears EnableLPIs in a redistributor's
> > GICR_CTLR holds neither. Two or more of them can drain the same cache
> > concurrently, and if each one observes the same entry, erases it and then
> > puts it, the single reference the cache holds on that entry is dropped
> > more than once. The entry can then be freed while an ITE still maps it.
> >
> > xa_erase() is atomic and returns the previous entry, so put only the entry
> > that this context actually removed. The cache reference is then dropped
> > exactly once per entry even when the invalidations run concurrently, and
> > the behavior is unchanged when only one context runs.
>
> Next time:
>
> Cc: stable@xxxxxxxxxxxxxxx
>
> > Fixes: 8201d1028caa ("KVM: arm64: vgic-its: Maintain a translation cache per ITS")
> > Signed-off-by: Hyunwoo Kim <imv4bel@xxxxxxxxx>
> > ---
> > arch/arm64/kvm/vgic/vgic-its.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
> > index 1d7e5d560af4..1e3706ac3b8e 100644
> > --- a/arch/arm64/kvm/vgic/vgic-its.c
> > +++ b/arch/arm64/kvm/vgic/vgic-its.c
> > @@ -597,8 +597,10 @@ static void vgic_its_invalidate_cache(struct vgic_its *its)
> >  	unsigned long idx;
> >
> >  	xa_for_each(&its->translation_cache, idx, irq) {
> > -		xa_erase(&its->translation_cache, idx);
> > -		vgic_put_irq(kvm, irq);
> > +		/* Only the context that erases the entry drops its cache ref. */
> > +		irq = xa_erase(&its->translation_cache, idx);
> > +		if (irq)
> > +			vgic_put_irq(kvm, irq);
> >  	}
> >  }
>
> This definitely works but TBH I'd rather just plug the subtle race and
> do invalidations behind the its_lock since it already nests with the
> cmd_lock.

Thank you for the review.

>
> Could you give this a spin?

I tested it and can confirm that your approach works as well.

Shall I send a v2 based on your patch?
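
For reference, the difference between the old loop body and the one in my
original patch can be replayed in userspace. This is only a minimal sketch,
not kernel code: struct fake_irq, cache_slot, slot_erase(), fake_put() and
replay() are hypothetical stand-ins for struct vgic_irq, one xarray slot,
xa_erase() and vgic_put_irq(), and the "concurrency" is a fixed sequential
replay of the interleaving in which both contexts observe the entry before
either one erases it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vgic_irq: only a reference count. */
struct fake_irq {
	atomic_int refcount;
};

/* Hypothetical stand-in for a single translation cache slot. */
static _Atomic(struct fake_irq *) cache_slot;

/* Models xa_erase(): atomically clear the slot, return the old entry. */
static struct fake_irq *slot_erase(void)
{
	return atomic_exchange(&cache_slot, (struct fake_irq *)NULL);
}

/* Models vgic_put_irq(): drop one reference. */
static void fake_put(struct fake_irq *irq)
{
	atomic_fetch_sub(&irq->refcount, 1);
}

/* Old loop body: put the pointer observed while iterating. */
static void invalidate_old(struct fake_irq *seen)
{
	slot_erase();
	fake_put(seen);
}

/* Patched loop body: put only the entry this context erased. */
static void invalidate_new(struct fake_irq *seen)
{
	struct fake_irq *erased = slot_erase();

	(void)seen;	/* the observed pointer is deliberately ignored */
	if (erased)
		fake_put(erased);
}

/*
 * Replay the problematic interleaving and return the final refcount.
 * The entry starts with two references: one held by the cache and one
 * held by an ITE that still maps it.
 */
static int replay(void (*invalidate)(struct fake_irq *))
{
	struct fake_irq irq;
	struct fake_irq *seen_a, *seen_b;

	atomic_init(&irq.refcount, 2);
	atomic_store(&cache_slot, &irq);

	/* Both contexts observe the entry before either erases it. */
	seen_a = atomic_load(&cache_slot);
	seen_b = atomic_load(&cache_slot);
	assert(seen_a == seen_b);

	invalidate(seen_a);
	invalidate(seen_b);

	return atomic_load(&irq.refcount);
}
```

In this replay, replay(invalidate_old) ends with the ITE's reference gone as
well, while replay(invalidate_new) leaves it in place. Taking its_lock around
the invalidation, as in your diff, avoids the interleaving altogether, so the
second context never observes the entry.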

>
> diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
> index 2ea9f1c7ebcd..b2225da212ec 100644
> --- a/arch/arm64/kvm/vgic/vgic-its.c
> +++ b/arch/arm64/kvm/vgic/vgic-its.c
> @@ -596,6 +596,8 @@ static void vgic_its_invalidate_cache(struct vgic_its *its)
>  	struct vgic_irq *irq;
>  	unsigned long idx;
>
> +	lockdep_assert_held(&its->its_lock);
> +
>  	xa_for_each(&its->translation_cache, idx, irq) {
>  		xa_erase(&its->translation_cache, idx);
>  		vgic_put_irq(kvm, irq);
> @@ -607,17 +609,16 @@ void vgic_its_invalidate_all_caches(struct kvm *kvm)
>  	struct kvm_device *dev;
>  	struct vgic_its *its;
>
> -	rcu_read_lock();
> +	guard(mutex)(&kvm->lock);
>
> -	list_for_each_entry_rcu(dev, &kvm->devices, vm_node) {
> +	list_for_each_entry(dev, &kvm->devices, vm_node) {
>  		if (dev->ops != &kvm_arm_vgic_its_ops)
>  			continue;
>
>  		its = dev->private;
> +		guard(mutex)(&its->its_lock);
>  		vgic_its_invalidate_cache(its);
>  	}
> -
> -	rcu_read_unlock();
>  }
>
>  int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
> @@ -1725,8 +1726,10 @@ static void vgic_mmio_write_its_ctlr(struct kvm *kvm, struct vgic_its *its,
>  		goto out;
>
>  	its->enabled = !!(val & GITS_CTLR_ENABLE);
> -	if (!its->enabled)
> +	if (!its->enabled) {
> +		guard(mutex)(&its->its_lock);
>  		vgic_its_invalidate_cache(its);
> +	}
>
>  	/*
>  	 * Try to process any pending commands. This function bails out early
>
> --
> Thanks,
> Oliver


Best regards,
Hyunwoo Kim