Re: [6.12.y regression] Regression with 58130e7ce6cb ("PCI/ERR: Ensure error recoverability at all times"): echo vfio-pci >driver_override does not work for DVB Adapter
From: Lukas Wunner
Date: Sun Mar 29 2026 - 12:22:45 EST
On Sun, Mar 29, 2026 at 03:52:15PM +0200, Bernd Schumacher wrote:
> [ 20.660351] vfio-pci 0000:07:00.0: vgaarb: pci_notify
> [ 20.660357] vfio-pci 0000:07:00.0: runtime IRQ mapping not provided by arch
> [ 20.660547] vfio-pci 0000:07:00.0: restore config 0x14: 0x00000000 -> 0xffffffff
> [ 20.660615] vfio-pci 0000:07:00.0: save config 0x00: 0x0003dd01
> [ 20.660620] vfio-pci 0000:07:00.0: save config 0x04: 0x00100000
> [ 20.660626] vfio-pci 0000:07:00.0: save config 0x08: 0x04800000
> [ 20.660631] vfio-pci 0000:07:00.0: save config 0x0c: 0x00000010
> [ 20.660636] vfio-pci 0000:07:00.0: save config 0x10: 0xfc500004
> [ 20.660642] vfio-pci 0000:07:00.0: save config 0x14: 0xffffffff
> [ 20.660647] vfio-pci 0000:07:00.0: save config 0x18: 0x00000000
> [ 20.660652] vfio-pci 0000:07:00.0: save config 0x1c: 0x00000000
> [ 20.660657] vfio-pci 0000:07:00.0: save config 0x20: 0x00000000
> [ 20.660662] vfio-pci 0000:07:00.0: save config 0x24: 0x00000000
> [ 20.660667] vfio-pci 0000:07:00.0: save config 0x28: 0x00000000
> [ 20.660670] vfio-pci 0000:07:00.0: vgaarb: pci_notify
> [ 20.660672] vfio-pci 0000:07:00.0: save config 0x2c: 0x0020dd01
> [ 20.660677] vfio-pci 0000:07:00.0: save config 0x30: 0x00000000
> [ 20.660682] vfio-pci 0000:07:00.0: save config 0x34: 0x00000050
> [ 20.660687] vfio-pci 0000:07:00.0: save config 0x38: 0x00000000
> [ 20.660689] vfio-pci 0000:12:00.0: vgaarb: pci_notify
> [ 20.660692] vfio-pci 0000:07:00.0: save config 0x3c: 0x000001ff
The above is from the non-working kernel version 6.12.73. The BAR at
offset 0x14 in config space is restored and saved with a value of
"all ones" here (0xffffffff).
The working kernel version 6.12.63 is using "all zeroes" instead
(0x00000000).
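To spot such entries in a longer log, a small filter along these lines may help. It is only a sketch: the helper name is_all_ones_bar() is made up for illustration, and it assumes the "save config 0x..: 0x........" line format shown in the quoted dmesg output and a type 0 header, where the BARs occupy offsets 0x10 through 0x24.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel function): returns 1 if the given
 * dmesg line is a "save config" record for a BAR register (offsets
 * 0x10..0x24 in a type 0 header) whose value is all ones, else 0.
 */
static int is_all_ones_bar(const char *line)
{
    unsigned int off, val;
    const char *p = strstr(line, "save config ");

    if (!p)
        return 0;   /* not a "save config" record */

    /* Line format assumed from the quoted log above. */
    if (sscanf(p, "save config 0x%x: 0x%x", &off, &val) != 2)
        return 0;

    return off >= 0x10 && off <= 0x24 && val == 0xffffffffu;
}
```

Fed the lines quoted above, this should flag only the record for offset 0x14.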
I'm guessing that the initial pci_save_state() that the offending
commit inserted into device enumeration already saves the incorrect
0xffffffff value and that is subsequently restored by vfio-pci after
resetting the device through a D0 -> D3hot -> D0 transition.
On the working kernel, the pci_restore_state() performed by vfio-pci
probably becomes a no-op because no pci_save_state() was performed
beforehand.
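To illustrate the guess, here is a toy userspace model of the save/restore behavior. It is not kernel code: struct toy_dev and the toy_*() functions are invented for this sketch, and it assumes that a restore without a prior save does nothing and that the reset leaves the BAR at zero (as the "0x00000000 -> 0xffffffff" line in the log suggests).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CFG_DWORDS 16   /* first 64 bytes of config space, as in the log */

/* Hypothetical stand-in for a PCI device, not the kernel's struct pci_dev. */
struct toy_dev {
    uint32_t config[CFG_DWORDS];   /* live config space */
    uint32_t saved[CFG_DWORDS];    /* snapshot taken by toy_save_state() */
    bool state_saved;
};

/* Snapshot the live config space, bad values included. */
static void toy_save_state(struct toy_dev *dev)
{
    memcpy(dev->saved, dev->config, sizeof(dev->saved));
    dev->state_saved = true;
}

/* Returns true if a snapshot was written back, false if it was a no-op. */
static bool toy_restore_state(struct toy_dev *dev)
{
    if (!dev->state_saved)
        return false;   /* nothing saved: leave config untouched */

    memcpy(dev->config, dev->saved, sizeof(dev->config));
    dev->state_saved = false;
    return true;
}

/* Assumption: the D0 -> D3hot -> D0 reset leaves the BAR at zero. */
static void toy_reset_bar(struct toy_dev *dev, unsigned int offset)
{
    assert(offset % 4 == 0 && offset / 4 < CFG_DWORDS);
    dev->config[offset / 4] = 0;
}
```

In this model, a device whose BAR at 0x14 reads 0xffffffff when toy_save_state() runs gets that value written back after the reset, whereas a device that was never saved keeps the zero left by the reset.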
The question is where the incorrect BAR value is coming from. This could
actually be a resource allocation issue that happens to manifest itself
as a passthrough failure. The origin isn't visible in the dmesg output
because the output is truncated.
Could you repeat this and add log_buf_len=16M to the kernel command line
so that the dmesg output isn't truncated?
Thanks!
Lukas