Re: [PATCH] iommu: Fix bypass of IOMMU readiness check for multi-IOMMU devices
From: Tudor Ambarus
Date: Mon Mar 23 2026 - 13:23:22 EST
Hi, Jason,
On 3/23/26 3:54 PM, Jason Gunthorpe wrote:
> On Mon, Mar 23, 2026 at 01:09:27PM +0000, Tudor Ambarus wrote:
>> Commit da33e87bd2bf ("iommu: Handle yet another race around
>> registration") introduced a readiness check in `iommu_fwspec_init()` to
>> prevent client drivers from configuring their IOMMUs before
>> `bus_iommu_probe()` has completed.
>>
>> To optimize the replay path, the readiness check was conditionally
>> gated behind `!dev->iommu`:
>> 	if (!dev->iommu && !READ_ONCE(iommu->ready))
>> 		return -EPROBE_DEFER;
>>
>> However, this assumption breaks down for devices that map to multiple
>> IOMMU instances.
>
> ?? We don't directly support "multiple IOMMU instances". There is only
> one dev->iommu.
>
> AFAIK if some drivers need to support multiple different instances of
> the same IOMMU driver they must deal with this fully internally and
> present to the core a "single instance" view.
Thanks for the quick answer. I may be missing a few things, and I should
have marked this as an RFC. Would you please help me understand a little
bit more?
Downstream we have a display controller that's using:

	iommus = <&sysmmu_19840000>, <&sysmmu_19c40000>;

These are 2 distinct platform devices: they probe independently and
each calls iommu_device_register() independently.
If I understood you correctly, the downstream driver should model its
architecture so that it calls iommu_device_register() only once, after
both devices are configured.
My downstream reality is different. Here's what I'm encountering:

1/ sysmmu_19840000: dev->iommu is NULL. iommu_fwspec_init() correctly
   evaluates !READ_ONCE(sysmmu_19840000->ready). Assuming it is ready,
   it allocates dev->iommu.
2/ dev->iommu is now NOT NULL. iommu_fwspec_init() is called for the
   second physical instance.
3/ Because of the !dev->iommu gate, the evaluation of
   !READ_ONCE(sysmmu_19c40000->ready) is short-circuited and skipped
   entirely.
But sysmmu_19c40000 is not ready yet: its bus_iommu_probe() is still
executing asynchronously on another CPU.
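
To make the sequence concrete, here is a minimal userspace sketch of the
gate only. It is not kernel code: model_dev, model_iommu and
model_fwspec_init() are hypothetical stand-ins that I made up for
illustration, and everything except the readiness condition is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from userspace headers */
#endif

/* Hypothetical, simplified stand-ins; not the real kernel layouts. */
struct model_iommu {
	const char *name;
	bool ready;		/* models iommu->ready */
};

struct model_dev {
	void *iommu;		/* models dev->iommu, NULL until allocated */
};

static int model_dev_iommu_storage;

/* Models only the readiness gate at the top of iommu_fwspec_init(). */
static int model_fwspec_init(struct model_dev *dev,
			     const struct model_iommu *iommu)
{
	/* Same shape as the real condition: skipped once dev->iommu is set. */
	if (!dev->iommu && !iommu->ready)
		return -EPROBE_DEFER;

	if (!dev->iommu)
		dev->iommu = &model_dev_iommu_storage;

	return 0;
}
```

Calling model_fwspec_init() first with a ready instance and then with a
not-ready one returns 0 both times, which is the behaviour I am seeing.
Calling it on a fresh device with only the not-ready instance returns
-EPROBE_DEFER as intended.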
If the core's intent is to strictly enforce a single IOMMU instance,
shouldn't iommu_fwspec_init() be checking

	fwspec->iommu_fwnode == iommu_fwnode

instead of matching the ops? Because the core currently matches on
ops, it permits aggregating multiple physical instances with the
same ops into one fwspec.
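
The difference between the two comparisons can be sketched in userspace
as below. Again, this is only an illustration under my own assumptions:
the sketch_* types and helpers are hypothetical, and the fwspec here
caches an ops pointer directly instead of going through
iommu_fwspec_ops().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real kernel structures. */
struct sketch_ops { int unused; };
struct sketch_fwnode { const char *name; };

struct sketch_iommu {
	const struct sketch_ops *ops;
	const struct sketch_fwnode *fwnode;
};

struct sketch_fwspec {
	const struct sketch_ops *ops;			/* simplification of iommu_fwspec_ops() */
	const struct sketch_fwnode *iommu_fwnode;	/* first instance that set up the fwspec */
};

/* Current behaviour: any instance sharing the same ops is accepted. */
static int sketch_check_by_ops(const struct sketch_fwspec *fwspec,
			       const struct sketch_iommu *iommu)
{
	return iommu->ops == fwspec->ops ? 0 : -EINVAL;
}

/* Proposed behaviour: only the instance that owns the fwspec is accepted. */
static int sketch_check_by_fwnode(const struct sketch_fwspec *fwspec,
				  const struct sketch_iommu *iommu)
{
	return fwspec->iommu_fwnode == iommu->fwnode ? 0 : -EINVAL;
}
```

With two instances that share one ops table but have distinct fwnodes,
the ops check accepts the second instance while the fwnode check
rejects it with -EINVAL.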
Thanks a ton!
ta
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2940,7 +2940,7 @@ int iommu_fwspec_init(struct device *dev, struct fwnode_handle *iommu_fwnode)
 		return -EPROBE_DEFER;
 	if (fwspec)
-		return iommu->ops == iommu_fwspec_ops(fwspec) ? 0 : -EINVAL;
+		return fwspec->iommu_fwnode == iommu_fwnode ? 0 : -EINVAL;
 	if (!dev_iommu_get(dev))
 		return -ENOMEM;