Re: [PATCH v6 4/9] PCI/CXL: Add sibling function coordination for reset

From: Dan Williams (nvidia)

Date: Wed Jun 03 2026 - 23:13:19 EST


Srirangan Madhavan wrote:
> Add helpers to collect CXL sibling PCI functions affected by a CXL reset
> and prepare them for reset by saving and disabling them. Restore those
> siblings and drop their references when reset coordination completes.
>
> Use the Non-CXL Function Map DVSEC to exclude non-CXL functions, and
> filter remaining siblings to functions that advertise CXL.cache or
> CXL.mem capability.
>
> Use pci_dev_trylock() for sibling locking and unwind on contention or
> allocation failure, so competing reset paths fail with an errno.

This is a pile of code just to precisely save and restore only the
functions impacted by the reset. What is not clear to me is the cost of
over-saving and over-restoring. The specification seems to imply that
CXL Reset has the same effect as FLR as far as CXL.io is concerned. That
could be read as saying that every function that speaks CXL.io (all of
them) sees the reset, even if only a subset participates in CXL.cachemem.

Otherwise, there is a good chance that the pci_dev_reset_iommu_prepare()
calls are all going to apply to the same iommu group for this device.

    pci_dev_save_and_disable(sibling);
    rc = pci_dev_reset_iommu_prepare(sibling);

...maybe the simple thing to do is just treat this like slot reset and
use the existing method of walking the device list by matching slot to
save and disable every function on the device. In other words, it is not
clear that the precision gained by skipping a few extra
save_and_disable cycles is worth the added code.
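
To make the suggestion concrete, here is a rough userspace model of the
"match on slot, ignore capability" walk. It is only a sketch: struct
model_dev and the model_* helpers are made-up stand-ins, not kernel
types or APIs, and the saved flag merely marks where a real
implementation would save and disable the function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI function; NOT struct pci_dev. */
struct model_dev {
	int slot;          /* which slot (device) the function belongs to */
	int fn;            /* function number within that slot */
	bool cxl_cachemem; /* what the precise filter would test; unused here */
	bool saved;        /* set where a real walk would save and disable */
};

/*
 * Mark every function in @slot as saved, regardless of whether it
 * advertises CXL.cache or CXL.mem. Returns the number of functions
 * touched.
 */
static int model_slot_save_and_disable(struct model_dev *devs, size_t n,
				       int slot)
{
	int count = 0;

	assert(devs != NULL);
	for (size_t i = 0; i < n; i++) {
		if (devs[i].slot != slot)
			continue;
		devs[i].saved = true;
		count++;
	}
	return count;
}

/* Undo the walk above for the same slot. Returns the number restored. */
static int model_slot_restore(struct model_dev *devs, size_t n, int slot)
{
	int count = 0;

	assert(devs != NULL);
	for (size_t i = 0; i < n; i++) {
		if (devs[i].slot != slot || !devs[i].saved)
			continue;
		devs[i].saved = false;
		count++;
	}
	return count;
}
```

The only selection criterion is slot membership, so there is no DVSEC
parsing and no per-function capability check. The trade-off is that a
function which never participates in CXL.cachemem still goes through a
save/restore cycle.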