Re: [PATCH v10 0/2] PCI/IOV: Fix SR-IOV locking races and AB-BA deadlock

From: Guenter Roeck

Date: Thu Mar 19 2026 - 15:25:57 EST


On 3/19/26 05:31, Niklas Schnelle wrote:
On Wed, 2026-03-18 at 23:03 +0200, Ionut Nechita (Wind River) wrote:
From: Ionut Nechita <ionut.nechita@xxxxxxxxxxxxx>

Hi Bjorn,

This is v10 of the fix for the SR-IOV race between driver .remove()
and concurrent hotplug events. v10 adds a second patch to fix the
AB-BA deadlock between device_lock and pci_rescan_remove_lock that
was reported by Guenter Roeck (via Google's AI review agent) and
confirmed by Benjamin Block.

The AB-BA deadlock:

  CPU0 (remove_store)           CPU1 (unbind_store)
  --------------------          --------------------
  pci_lock_rescan_remove()
                                device_lock()
                                driver .remove()
                                  sriov_del_vfs()
                                    pci_lock_rescan_remove() <-- WAITS
  pci_stop_bus_device()
    device_release_driver()
      device_lock() <-- WAITS

Patch 2/2 fixes this by calling device_release_driver() in
remove_store() before pci_stop_and_remove_bus_device_locked(), so
that the driver is already unbound when pci_rescan_remove_lock is
acquired. Both paths then take locks in the same order: device_lock
first, then pci_rescan_remove_lock.
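
For readers who want to see the ordering rule in isolation, here is a
minimal user-space sketch using pthreads. It is not the kernel patch:
dev_lock, rescan_remove_lock, remove_path(), unbind_path() and
run_lock_order_demo() are hypothetical stand-ins, and the remove path
is simplified to a single nested acquisition. It only illustrates that
two paths which both take the device lock first and the rescan/remove
lock second cannot end up in the AB-BA wait shown above.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical user-space stand-ins for the two kernel locks. */
static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;

/* Only modified while both locks are held. */
static long ops_done;

/* Shared critical section: device lock first, rescan/remove lock second. */
static void locked_step(void)
{
	int rc;

	rc = pthread_mutex_lock(&dev_lock);
	assert(rc == 0);
	rc = pthread_mutex_lock(&rescan_remove_lock);
	assert(rc == 0);

	ops_done++;

	/* Release in reverse order of acquisition. */
	pthread_mutex_unlock(&rescan_remove_lock);
	pthread_mutex_unlock(&dev_lock);
}

/* Simplified model of the remove path once it follows the common order. */
static void *remove_path(void *arg)
{
	int i, n = *(int *)arg;

	for (i = 0; i < n; i++)
		locked_step();
	return NULL;
}

/* Simplified model of the unbind path, which already used this order. */
static void *unbind_path(void *arg)
{
	int i, n = *(int *)arg;

	for (i = 0; i < n; i++)
		locked_step();
	return NULL;
}

/* Runs both paths concurrently and returns the number of completed steps. */
long run_lock_order_demo(int iterations)
{
	pthread_t t_remove, t_unbind;
	int rc;

	ops_done = 0;

	rc = pthread_create(&t_remove, NULL, remove_path, &iterations);
	assert(rc == 0);
	rc = pthread_create(&t_unbind, NULL, unbind_path, &iterations);
	assert(rc == 0);

	pthread_join(t_remove, NULL);
	pthread_join(t_unbind, NULL);

	return ops_done;
}
```

Calling run_lock_order_demo(n) from a small main() should return once
both threads have finished, with a result of 2 * n. Swapping the two
pthread_mutex_lock() calls in only one of the paths would reintroduce
the inverted order and could hang the program.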

Note: the concurrent unbind_store + hotplug-event case (where the
hotplug handler takes pci_rescan_remove_lock before device_lock)
remains a known limitation. This is a pre-existing issue that
Benjamin Block is addressing separately in:
https://lore.kernel.org/linux-pci/354b9e4a54ced67f3c89df198041df19434fe4c8.1773235561.git.bblock@xxxxxxxxxxxxx/

--- snip ---

Ionut Nechita (2):
  PCI/IOV: Make pci_lock_rescan_remove() reentrant and protect
    sriov_add_vfs/sriov_del_vfs
  PCI: Fix AB-BA deadlock between device_lock and
    pci_rescan_remove_lock in remove_store

 drivers/pci/iov.c       |  9 +++++----
 drivers/pci/pci-sysfs.c | 20 +++++++++++++++++++-
 drivers/pci/probe.c     | 11 +++++++++--
 3 files changed, 33 insertions(+), 7 deletions(-)

--
2.43.0

Hi Ionut,

For your awareness, I saw that this series has some findings on
Google's new Sashiko AI reviewing tool[0]. At a quick glance, the
findings seem like reasonable concerns to me. I'm also still
looking at this independently, of course.


It is almost scary to see how many problems Sashiko is able to find.
The AB-BA deadlock that the second patch in the series tries to fix
was reported by a prototype version of it when running it on an LTS
backport.

Guenter