Re: [PATCH v2 7/7] iommu/arm-smmu-v3: Block ATS upon an ATC invalidation timeout
From: Shuai Xue
Date: Thu Mar 19 2026 - 03:41:57 EST
On 3/19/26 11:26 AM, Nicolin Chen wrote:
> On Thu, Mar 19, 2026 at 10:56:43AM +0800, Shuai Xue wrote:
> > On 3/18/26 3:15 AM, Nicolin Chen wrote:
> > > For batched ATC_INV commands, SMMU hardware only reports a timeout at the
> > > CMD_SYNC, which could follow the batch issued for multiple devices. So, it
> > > isn't straightforward to identify which command in a batch resulted in the
> > > timeout. Fortunately, the invs array has a sorted list of ATC entries. So,
> > > the issued batch must be sorted as well. This makes it possible to bisect
> > > the batch to retry the command per Stream ID and identify the master.
> > Nit: The implementation is a linear per-SID retry, not a binary
> > search / bisection. Suggest rewording to:
> >
> > "retry the ATC_INV command for each unique Stream ID in the batch
> > to identify the unresponsive master"
> You are right. And that sounds OK.
> > > +		step = arm_smmu_get_step_for_sid(smmu, sid);
> > > +		WRITE_ONCE(step->data[1],
> > > +			   READ_ONCE(step->data[1]) & cpu_to_le64(~STRTAB_STE_1_EATS));
> > This non-atomic read-modify-write on step->data[1] can race with the
> > normal STE installation path (arm_smmu_write_entry → entry_set →
> > WRITE_ONCE).
> >
> > The error path runs from:
> >
> >   __arm_smmu_domain_inv_range()            (data path, no group->mutex)
> >     → arm_smmu_cmdq_batch_retry()
> >       → arm_smmu_master_disable_ats()
> >         → arm_smmu_disable_eats_for_sid()  ← NO locks on STE
> >
> > The normal STE path runs from:
> >
> >   iommu_attach_device()
> >     → mutex_lock(&group->mutex)
> >       → arm_smmu_attach_dev()
> >         → mutex_lock(&arm_smmu_asid_lock)
> >           → arm_smmu_install_ste_for_dev()
> >             → arm_smmu_write_entry()       ← holds both mutexes
> >
> > Since the error path holds neither group->mutex nor arm_smmu_asid_lock,
> > the following race is possible:
> Invalidations can run in atomic context, so we can't hold those
> mutex locks there.
> >   CPU A (error path):            CPU B (attach path):
> >     READ data[1] = X
> >                                    WRITE data[1] = Y (new STE config)
> >     WRITE data[1] = X & ~EATS
> >     // Y is lost
> >
> > This could clobber a concurrent STE update from the attach path.
> Oh, that's true. Maybe this:
>
> 	__le64 new, old = READ_ONCE(step->data[1]);
> 	[...]
> 	do {
> 		new = old & cpu_to_le64(~STRTAB_STE_1_EATS);
> 	} while (!try_cmpxchg64(&step->data[1], &old, new));
>
> ?
Yes, the cmpxchg loop looks correct to me.
Thanks.
Shuai