Re: [PATCH v2 2/3] vfio-pci/zdev: Add VFIO FMB device feature
From: Omar Elghoul
Date: Wed Jun 03 2026 - 08:39:36 EST
On 6/2/26 6:24 PM, Alex Williamson wrote:
On Tue, 19 May 2026 18:42:03 -0400Acked, will remove in v3.
Omar Elghoul <oelghoul@xxxxxxxxxxxxx> wrote:
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.cSomewhat gratuitous variable usage.
index 0990fdb146b7..1e9efe2bee69 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -167,3 +167,80 @@ void vfio_pci_zdev_close_device(struct vfio_pci_core_device *vdev)
if (zpci_kvm_hook.kvm_unregister)
zpci_kvm_hook.kvm_unregister(zdev);
}
+
+int vfio_pci_zdev_feature_fmb(struct vfio_pci_core_device *vdev, u32 flags,
+ void __user *arg, size_t argsz)
+{
+ struct zpci_dev *zdev;
+ struct vfio_device_feature_zpci_fmb fmb = {0};
+ u32 ops = VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_SET;
Acked as well
+ int ret;Use a guard and avoid the release_lock gotos.
+
+ ret = vfio_check_feature(flags, argsz, ops, sizeof(fmb));
+ if (ret != 1)
+ return ret;
+
+ zdev = to_zpci(vdev->pdev);
+ if (!zdev)
+ return -ENODEV;
+
+ mutex_lock(&zdev->fmb_lock);
Oversight on my part, I will fix this.
+ if (flags & VFIO_DEVICE_FEATURE_SET) {Remaining flag bits are not tested, breaks any future expanded use of
+ if (copy_from_user(&fmb, arg, sizeof(fmb))) {
+ ret = -EFAULT;
+ goto release_lock;
+ }
+
+ if (fmb.flags & VFIO_DEVICE_FEATURE_ZPCI_FMB_FLAGS_ENABLED)
+ ret = zpci_fmb_reenable_device(zdev);
+ else
+ ret = zpci_fmb_disable_device(zdev);
+ goto release_lock;
flags.
We want the user (e.g. QEMU) to be able to control these so that when a
Why does the user need to be able to control these?
guest enables or disables the FMB, this state gets cascaded to the host and
all the way to the firmware.
Yes it does, but this one isn't an oversight and is intentional behavior
Doesn't allowing the user to disable FMB remove guaranteed host-based
monitoring?
to achieve the functionality mentioned above. The host-based monitoring is
not necessarily guaranteed and is treated as a device-specific state, so it
makes sense in the case of passthrough to have that state reflect the state
of the guest that is actually using the device.
It might be possible but it would undoubtedly be really ugly and harder to
Since this is already provided via debugfs, why not make this a
userspace problem to interact with the existing interface?
maintain. I think what we'd dislike most about using debugfs is parsing
text data into the FMB structure. If any of the text representations of the
fields were to change, we would need to update them anywhere that uses them
(e.g., in QEMU or any other user driver). The ABI would be super fragile.
The firmware only writes the FMB into one buffer every firmware-specified
Alternatively, couldn't the existing zpci mediation be extended to
support the guest registering a fmb buffer to be written at regular
intervals (the interface here seems to drop the reporting interval).
interval. If we wanted to write the FMB directly into guest memory, we
would either 1) lose host access to the FMB or 2) have to run a periodic
worker in the kernel to copy the host FMB into the guest-provided buffer
every time the firmware does an update. I don't believe either of these
approaches are favorable.
WRT reporting interval, I intentionally dropped that one as it is already
provided by VFIO_DEVICE_INFO_CAP_ZPCI_GROUP.
I don't know how I missed this one... I'll fix it or remove it entirely by
+ }Flag bit is cleared, goto skips copy-to-user, returns success...
+
+ ret = 0;
+ if (zdev->fmb) {
+ fmb.flags |= VFIO_DEVICE_FEATURE_ZPCI_FMB_FLAGS_ENABLED;
+ } else {
+ fmb.flags &= ~VFIO_DEVICE_FEATURE_ZPCI_FMB_FLAGS_ENABLED;
+ goto release_lock;
+ }
using the guard like you suggested earlier.
Acked.
Regardless of what's documented in the header, the buffer should be
assumed to be userspace garbage. Failing to set or clear the entire
flags field precludes any future use.
Good point, I had not previously considered using -ENOMSG. It makes sense
Why do we need to use flags to indicate the enable state? Couldn't we
just as easily have success indicate enabled and -ENOMSG indicate
disabled? Thanks,
and would for sure simplify this feature.
Thanks.
Alex