Re: [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock

From: Ionut Nechita (Wind River)

Date: Thu Mar 19 2026 - 16:38:54 EST


On Thu, 19 Mar 2026 13:31:39 +0100, Niklas Schnelle wrote:
> For your awareness, I saw that this series has some findings on
> Google's new Sashiko AI reviewing tool[0]. At a quick glance the
> findings seem like at least reasonable concerns to me. I'm still
> looking at this independently also of course.

Hi Niklas,

Thanks for pointing this out. I went through each of the Sashiko
findings carefully(+ AI). Here is my analysis:


1) Incomplete Deadlock Fix (Hierarchical Topology)
— remove_store on a bridge still has the AB-BA between
pci_rescan_remove_lock and device_lock(child)

This is correct, but it is a pre-existing issue, not introduced
by this series. The v10 cover letter explicitly acknowledges this:

"Note: the concurrent unbind_store + hotplug-event case (where
the hotplug handler takes pci_rescan_remove_lock before
device_lock) remains a known limitation."

and points to Benjamin Block's separate series that addresses the
broader hierarchical locking problem:
https://lore.kernel.org/linux-pci/354b9e4a54ced67f3c89df198041df19434fe4c8.1773235561.git.bblock@xxxxxxxxxxxxx/

Patch 2/2 fixes the specific deadlock reported by Guenter Roeck
(remove_store vs unbind_store on the same device). It does not
claim to solve all lock ordering issues in the PCI subsystem.
Trying to do so in a stable-targeted fix would be too invasive.


2) Driver Teardown Ordering Violation
— unbinding the bridge driver before its children causes PCIe
errors because pci_disable_device() clears Memory/IO Enable

This is a partially false positive. Sashiko assumes that
pci_disable_device() on a bridge disables forwarding of
transactions to child devices. It does not — pci_disable_device()
clears the bus master bit and decrements the enable count, but
does not touch the Memory Space Enable or I/O Space Enable bits
that control transaction forwarding on bridges. Child devices
can still access their MMIO regions through the bridge.

What does happen is that pcie_portdrv_remove() tears down port
services (AER, hotplug, PME, DPC) before the children are
unbound. This means error reporting is degraded during child
teardown, but it does not cause Unsupported Requests, Completer
Aborts, or system crashes as Sashiko claims.

Also worth noting: this scenario (remove_store on a bridge)
is not the path that was deadlocking. The reported deadlocks
were on the PF device itself, not on its parent bridge.


3) TOCTOU Race Condition / Lock Window Vulnerability
— a driver can rebind between device_release_driver() and
pci_stop_and_remove_bus_device_locked()

This is theoretically valid but practically impossible. The
window is a few instructions wide. For this race to trigger:

a) device_remove_file_self() has already removed the "remove"
sysfs attribute, signaling the device is being torn down
b) a bind_store or udev probe would need to fire in exactly
that window
c) the newly bound driver's probe() would need to call
pci_enable_sriov() and block on pci_rescan_remove_lock

This is the same pattern used elsewhere in the kernel (e.g.,
the existing remove_store already had no synchronization between
device_remove_file_self() and pci_stop_and_remove_bus_device_locked()
— the patch just adds one more call in between).

If this is a real concern, it would need to be addressed as a
separate improvement, not as a blocker for this fix.


4) Use-After-Free in remove_store
— device_remove_file_self() breaks active sysfs protection,
allowing concurrent device_del() to free dev

This is a false positive. kernfs_remove_self() does three things:

a) breaks active protection (decrements active ref)
b) calls __kernfs_remove(kn) to unlink the node
c) calls kernfs_unbreak_active_protection(kn) to restore the
active ref

After step (c), the active reference is restored. The sysfs
write handler (kernfs_fop_write_iter) still holds this active
reference for the duration of the store callback. A concurrent
device_del() calling kernfs_remove() on the parent directory
will call kernfs_drain() on all remaining children, which blocks
until active references drop to zero. Since remove_store still
holds the active ref (restored in step c), device_del() will
block until remove_store returns.

Additionally, pci_rescan_remove_lock serializes any concurrent
PCI device removal (hotplug events also take this lock), so
the scenario described by Sashiko cannot actually occur.

This is the same lifetime model that the existing remove_store
(without this patch) has always relied on.


In summary: finding 1 is a known pre-existing limitation documented
in the cover letter, findings 2 and 4 are false positives, and
finding 3 is a theoretical race that is not practically exploitable
and is not new to this patch.

I don't think a v11 is needed based on these findings. But I'm
happy to discuss further if you see something I missed in my
analysis.

Thanks,
Ionut