Re: [PATCH v7 3/3] KVM: s390: Introducing kvm_arch_set_irq_inatomic fast inject
From: Matthew Rosato
Date: Fri May 22 2026 - 13:57:10 EST
On 5/20/26 12:04 PM, Douglas Freimuth wrote:
> s390 needs a fast path for irq injection, and along those lines we
> introduce kvm_arch_set_irq_inatomic. Instead of placing all interrupts on
> the global work queue as it does today, this patch provides a fast path for
> irq injection.
>
> The inatomic fast path cannot lose control since it is running with
> interrupts disabled. This meant making the following changes that exist on
> the slow path today. First, the adapter_indicators page needs to be mapped
> since it is accessed with interrupts disabled, so we added map/unmap
> functions. Second, access to shared resources between the fast and slow
> paths needed to be changed from mutex and semaphores to spin_lock's.
> Finally, the memory allocation on the slow path utilizes GFP_KERNEL_ACCOUNT
> but we had to implement the fast path with GFP_ATOMIC allocation. Each of
> these enhancements were required to prevent blocking on the fast inject
> path.
>
> Fencing of Fast Inject in Secure Execution environments is enabled in the
> patch series by not mapping adapter indicator pages. In Secure Execution
> environments the path of execution available before this patch is followed.
>
> Statistical counters have been added to enable analysis of irq injection on
> the fast path and slow path including io_390_inatomic, io_flic_inject_airq,
> io_set_adapter_int and io_390_inatomic_adapter_masked.
>
> Signed-off-by: Douglas Freimuth <freimuth@xxxxxxxxxxxxx>
[...]
> int kvm_s390_inject_vm(struct kvm *kvm,
> - struct kvm_s390_interrupt *s390int)
> + struct kvm_s390_interrupt *s390int, struct kvm_s390_interrupt_info *inti)
> {
> - struct kvm_s390_interrupt_info *inti;
> int rc;
>
> - inti = kzalloc_obj(*inti, GFP_KERNEL_ACCOUNT);
> - if (!inti)
> - return -ENOMEM;
> -
> inti->type = s390int->type;
> switch (inti->type) {
> case KVM_S390_INT_VIRTIO:
> @@ -2010,8 +2005,7 @@ int kvm_s390_inject_vm(struct kvm *kvm,
> 2);
>
> rc = __inject_vm(kvm, inti);
> - if (rc)
> - kfree(inti);
You removed this kfree, but there's still one in the default case
above. Remove that as well so the caller is always responsible for
both allocating and freeing the kvm_s390_interrupt_info.
This appears to match the Sashiko complaint about a double-free, but
you want to fix it here -- not at the individual caller(s).
> +
> return rc;
> }
>
[...]
> +
> +/*
> + * kvm_arch_set_irq_inatomic: fast-path for irqfd injection
> + */
> +int kvm_arch_set_irq_inatomic(struct kvm_kernel_irq_routing_entry *e,
> + struct kvm *kvm, int irq_source_id, int level,
> + bool line_status)
> +{
> + int ret, setbit;
> + struct s390_io_adapter *adapter;
> + struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int;
> + struct kvm_s390_interrupt_info *inti;
> + struct kvm_s390_interrupt s390int = {
> + .type = KVM_S390_INT_IO(1, 0, 0, 0),
> + .parm = 0,
> + };
> +
> + kvm->stat.io_390_inatomic++;
> +
> + /* We're only interested in the 0->1 transition. */
> + if (!level)
> + return -EWOULDBLOCK;
Sashiko makes a good point here; if we -EWOULDBLOCK, it's just going to
fall back to the irqfd_inject path which will drive set_adapter_int()
which will always return 0 due to the same !level check.
So while functionally it's not harmful, it would be an optimization to
simply return 0 here. Please test it out.
> + if (e->type != KVM_IRQ_ROUTING_S390_ADAPTER)
> + return -EWOULDBLOCK;
> +
> + adapter = get_io_adapter(kvm, e->adapter.adapter_id);
> + if (!adapter)
> + return -EWOULDBLOCK;
> +
> + s390int.parm64 = isc_to_int_word(adapter->isc);
> + setbit = 1;
> + ret = adapter_indicators_set_fast(kvm, adapter, &e->adapter, setbit);
> + if (ret < 0)
> + return -EWOULDBLOCK;
> + if (!ret || adapter->masked) {
Sashiko made me double-check your work here. AFAICT this is correct:
Let's compare it to set_adapter_int(), where we inject only if
if ((ret > 0) && !adapter->masked) {
ret = kvm_s390_inject_airq(kvm, adapter);
Here we are trying to decide whether to NOT inject, so invert it:
if (ret <= 0 || adapter->masked)
and you already checked ret < 0 above, so:
if (ret == 0 || adapter->masked)
or
if (!ret || adapter->masked)
I guess it's more a question of whether you want to consider masked
interrupts (adapter->masked) as a different counter than coalesced
interrupts (!ret). I'm not sure we care that much to track this
distinction, but I think the sum counter is still valuable; it's how
many times times we delivered interrupts in atomic without actually
injecting the adapter interrupt. (Actually, not quite true, see below)
All of this is a long-winded way of saying 'maybe you should rename
the counter to more-accurately describe what you're counting'.
And don't forget to update the commit message.
> + kvm->stat.io_390_inatomic_adapter_masked++;
> + return 0;
> + }
> +
> + inti = kzalloc_obj(*inti, GFP_ATOMIC);
> + if (!inti) {
> + setbit = 0;
> + adapter_indicators_set_fast(kvm, adapter, &e->adapter, setbit);
> + return -EWOULDBLOCK;
> + }
> +
> + if (!test_kvm_facility(kvm, 72) || !adapter->suppressible) {
Please look into the sashiko reports re: irqsave and the fi->lock.
> + ret = kvm_s390_inject_vm(kvm, &s390int, inti);
> + if (ret == 0) {
> + return ret;
> + } else {
> + setbit = 0;
> + adapter_indicators_set_fast(kvm, adapter, &e->adapter, setbit);
> + kfree(inti);
> + return -EWOULDBLOCK;
> + }
> + }
> +
> + spin_lock(&fi->ais_lock);
> + if (fi->nimm & AIS_MODE_MASK(adapter->isc)) {
> + trace_kvm_s390_airq_suppressed(adapter->id, adapter->isc);
> + spin_unlock(&fi->ais_lock);
> + kfree(inti);
Actually, about that comment above re: io_390_inatomic_adapter_masked..
Here is another place where we set the indicators without injecting an
adapter interrupt. Now due to adapter interrupt suppression.
If you want that counter to include all 3 (coalesced, masked, and
suppressed) then you should increment it here as well after you rename
it.
> + return 0;
> + }