Re: [PATCH] cxl/region: Fix a race bug in delete_region_store

From: Dan Williams

Date: Tue Mar 31 2026 - 23:51:13 EST

Sungwoo Kim wrote:
> On Tue, Mar 10, 2026 at 6:54 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >
> > Sungwoo Kim wrote:
> > > Fix this by replacing devm_release_action() with devm_remove_action_nowarn()
> > > followed by a manual call to unregister_region(). devm_remove_action_nowarn()
> > > removes the devres tracking entry and returns an error code.
> >
> > No, that is not an acceptable or comprehensive fix. A subsystem should
> > never try to double unregister a device. Keep in mind the decoder sysfs
> > stays alive while the devres_release_all() actions are running for
> > port->uport_dev. As a result, yes, Davidlohr is right. You can
> > theoretically observe the unregister_region() release action firing
> > while delete_region_store() is running.
>
> I appreciate your review and suggestion of a new patch. It seems more
> complicated than I expected!

Yes, the CXL subsystem is unfortunately... complicated.

> I'm new to CXL and device drivers, so I would like to confirm that my
> understanding is correct.
> - CXL tolpology could be: [upstream] host -> switch(s) -> endpoints(s)
> [downstream].
> - The host can configure a unique address range (= region) to access
> each device.
> - The bug happened because releasing the same regions can race.
> - Although the prior patch handled this, it's insufficient because it
> can also race with releasing the device (what Davidlohr had reported).

The patch was technically correct but it relies on a design that
requires depending on a double free semantic. Apparently Rust needs it
to enforce a higher level semantic, but in C we need to handle that
semantic directly.

> > The only comprehensive fix I currently see to that problem is to indeed
> > get it back synchronized under the device lock. This prevents multiple
> > requesters from colliding, and from devres_release_all() deleting an
> > action that delete_region_store() already committed to handling.
> >
> > This looks something like schedule the devres action to be released in
> > workqueue under guard(device)(&port->udev).
>
> I assume you meant guard(device)(port->uport_dev).

Correct. When I say "something like" it almost always means "be careful
what I am about to say is full of bugs" so, good catch.

> > It is ugly, but it may be the case that this wants to synchronously
> > delete the region, and asynchronously cleanup the release action.
> >
> > I.e. unregister_region() grows a new:
> >
> > if (!test_and_set_bit(CXL_REGION_F_DELETE, ...)
> >
> > ...flag to enforce only one path successfully deletes.
> > delete_region_store() calls unregister_region() directly. Then the
>
> To fix this, you suggested:
> - Add a flag to make unregister_region() idempotent.
> - Make delete_region_store() call unregister_region() directly.
>
> Now unregister_region() is race-free. But we need to release an action.
>
> > workqueue's job is to trigger the release action under the lock only if
> > port->uport_dev is still driver-attached by the time the workqueue runs.
> >
>
> What I'm missing is this part.
> How could asynchronous cleanup solve the race? In the worst case, the
> workqueue could start immediately after we schedule it. In this case,
> it's the same as the synchronous execution.

Asynchronous cleanup gets us into a context where we are safe to acquire
locks. delete_region_store can not hold guard(device)(port->uport_dev).
You can try it to see the full lockdep splat. It is setup by fact that
unregistering the root decoders is performed under that lock.
Unregistering root decoder flushes userspace out of sysfs attribute
handlers. If those are holding the lock themselves it is an ABBA
deadlock.

I circled back to this because the CXL accelerator patches also have a
reason to autocleanup regions so this would be another source of
cxl_region deletion to rendezvous.