Re: [PATCH rc v2] iommu: Fix nested pci_dev_reset_iommu_prepare/done()

From: Nicolin Chen

Date: Thu Mar 19 2026 - 17:35:16 EST


On Thu, Mar 19, 2026 at 07:14:21PM +0800, Shuai Xue wrote:
> On 3/19/26 12:31 PM, Nicolin Chen wrote:
> > @@ -3961,9 +3962,10 @@ int pci_dev_reset_iommu_prepare(struct pci_dev *pdev)
> > guard(mutex)(&group->mutex);
> > - /* Re-entry is not allowed */
> > - if (WARN_ON(group->resetting_domain))
> > - return -EBUSY;
> > + if (group->resetting_domain) {
> > + group->reset_cnt++;
> > + return 0;
> > + }
> > ret = __iommu_group_alloc_blocking_domain(group);
>
> pci_dev_reset_iommu_prepare/done() have NO singleton group check.
> They operate on the specific pdev passed in, but use group-wide
> state (resetting_domain, reset_cnt) to track the reset lifecycle.
>
> Interestingly, the broken_worker in patch 3 of the ATC timeout
> series DOES have an explicit singleton check:
>
> if (list_is_singular(&group->devices)) {
> /* Note: only support group with a single device */
>
> This reveals an implicit assumption: the entire prepare/done
> mechanism works correctly only for singleton groups. For
> multi-device groups:
>
> - prepare() only detaches the specific pdev, leaving other
> devices in the group still attached to the original domain
> - The group-wide resetting_domain/reset_cnt state can be
> corrupted by concurrent resets on different devices (as
> discussed above)

That's a phenomenal catch!

I think we shall have a reset_ndevs in the group and a reset_depth
in the gdev.

> If prepare/done is truly meant only for singleton groups, it
> should enforce this explicitly:
>
> if (!list_is_singular(&group->devices))
> return -EOPNOTSUPP;

-EOPNOTSUPP will fail ail the PCI reset caller functions, though we
could change them to ignore -EOPNOTSUPP or we could probably do:
if (!list_is_singular(&group->devices))
return 0;

But I feel this would be too heavy for a bug fix.

> If it's meant to support multi-device groups, then the per-device
> vs group-wide state mismatch needs to be resolved — either by
> making the state per-device, or by detaching/restoring all
> devices in the group together.

Per-gdev should work, IMHO, at least for this patch.

For iommu_report_device_broken(), I am thinking we could still try
a per-gdev WQ, by changing the mutex-protected gdev list to an RCU
one.

Thanks
Nicolin