Re: [PATCH v4 18/24] iommu/arm-smmu-v3: Introduce master->ats_broken flag
From: Nicolin Chen
Date: Tue Jun 02 2026 - 01:48:41 EST
On Mon, Jun 01, 2026 at 09:15:47PM -0300, Jason Gunthorpe wrote:
> On Mon, Jun 01, 2026 at 01:41:26PM -0700, Nicolin Chen wrote:
> > On Mon, Jun 01, 2026 at 09:32:31AM -0300, Jason Gunthorpe wrote:
> > > On Fri, May 29, 2026 at 06:27:40PM -0700, Nicolin Chen wrote:
> > > > On Tue, May 19, 2026 at 09:06:58AM -0300, Jason Gunthorpe wrote:
> > > > > On Mon, May 18, 2026 at 08:39:01PM -0700, Nicolin Chen wrote:
> > > > So I've tried INV_TYPE_ATS_BROKEN: during per-domain invalidation,
> > > > each batch is built from domain->invs so it can carry the "invs";
> > > > if the batch times out, we can immediately mutate its ATS entries.
> > > >
> > > > But I realized a limitation. E.g., if a device attaches to two SVA
> > > > domains on two SSIDs. An invalidation timing out on one of the SVA
> > > > domains could mark INV_TYPE_ATS_BROKEN in its own invs, but not in
> > > > the other SVA domain's invs?
> > >
> > > You'd have to mark all the S1's sharing the STE.
> >
> > That would be a bit convoluted as we would have to go through all
> > other domains' invs arrays.
>
> Ok, that is certainly an annoying problem.
>
> I don't have a better idea than storing the master unfortunately
>
> But I think the locking for that is going to be tricky, I'm not sure it does
> actually fully work..
Yes, there can be a race that sets STE.EATS back while per-master
flag is set, which would skip the ATC_INV in commit(), so no more
ATC_INV timeout that resets STE.EATS=0. To close it, we can force
STE.EATS=0 at the end of commit() when state->ats_enabled and the
per-master flag are both set, which is only possible in a race.
However, after more thought, I found that there's a big trade-off
to pay in the invalidation path, if we go on this path. The invs
array only has SID, so it needs to convert it to master using the
rb_tree for every entry (holding streams_lock).
I asked AI to list and compare the three solutions that we have:
(1) Per-invs marker: INV_TYPE_ATS_BROKEN + master_domains
disable_ats() in the timeout path walks master->master_domains
and flips matching ATS invs entries to the BROKEN type.
+ invs walker is free (one case label in the existing type switch).
+ No lock or pointer deref in the invs walker.
+ No master pointer stored in invs; no lifetime concern.
- disable_ats() walks every (master, domain) and marks each invs
set; the list needs locking usable from atomic.
- No master-level flag, so atc_inv_master loses the cheap early
skip and attach has nothing to gate ats_enabled on. Concurrent
quarantine self-heals only after the next ATC_INV times out (~1s),
unless we also keep a master flag (i.e. a hybrid of (1) and (2)).
(2) Per-master flag + streams_lock
invs walker resolves SID -> master via streams_lock and reads
master->ats_broken.
+ Single source of truth on the master.
+ disable_ats() is one WRITE_ONCE.
+ atc_inv_master early-skips via one READ_ONCE.
+ attach gates ats_enabled on the flag; a concurrent quarantine
race can be closed by a short post-attach re-check in commit()
+ No master pointer in invs; no lifetime concern.
- invs walker pays streams_lock + rb_find(SID) per ATS entry on
every invalidation. Measurable on ATS-heavy workloads.
(3) Per-master flag + inv->master pointer (v4)
invs entry carries a master pointer; the invs walker reads
cur->master->ats_broken directly.
+ invs walker is one READ_ONCE through a cached pointer.
+ disable_ats is one WRITE_ONCE.
+ atc_inv_master early-skip via one READ_ONCE.
+ attach gate + post-attach re-check, same as (2).
- invs holds a master ptr, so release_device must synchronize_rcu()
before freeing the master to drain walkers under rcu_read_lock().
We dropped this from v4 for that reason.
Side-by-side:
(1) (2) (3)
invs walker fastest slow fast
disable_ats slow fast fast
atc_inv skip slow fast fast
attach race slow fast fast
master free fast fast slow
FWIW, AI picks (3), but this would reintroduce synchronize_rcu()
that you dislike. My personal feeling is that (1) is the cleanest
shirt in the dirty laundry?
I'd like to hear your thought before finalizing the design.
Thanks
Nicolin