Re: [PATCH v3 0/2] cxl/region: Fix race conditions in cxl region unregistration.

From: Sungwoo Kim

Date: Tue Apr 28 2026 - 01:45:14 EST


On Mon, Apr 27, 2026 at 1:42 PM Dave Jiang <dave.jiang@xxxxxxxxx> wrote:
>
>
>
> On 4/26/26 8:20 PM, Sungwoo Kim wrote:
> > This version mainly addresses Dave's comments and meaningful issues reported by Sashiko AI[1].
> >
> > Overview
> > ========
> > This patch series fixes race conditions in cxl region unregistration.
> >
> > devm_release_action() should be called once, otherwise, it warns about
> > the second call. However, the current implementation has a race condition
> > that allows multiple calls to devm_release_action(). The details are in
> > each patch.
> >
> > To fix these, the first patch adds a new function that guarantees that
> > devm_release_action() is called only once.
> > Using this function, the second patch subsitutes the current use of
> > devm_release_action() in cxl region with the new function.
> >
> > Change in v3
> > ============
> > Addressed Dave's comments:
> > - Split and made this in a patch series.
> > - Enhanced a commit log explaining why a workqueue is used in the first patch.
> > - Added a context on how these issues were found and what impact was observed and how that effects a user in the second patch.
> > Sashiko AI review fixes:
> > - Fixed construct_region() as it also can race.
> > - Used a driver's wq instead of system wq so unbinding can drain a work.
>
> A few issues Sashiko raised with the v3.
> https://sashiko.dev/#/patchset/20260427032010.916681-2-iam%40sung-woo.kim

A Sashiko's reasoning makes sense to me. In the middle of
construct_region(), users can perform a sysfs write, which then calls
unregister_region().
unregister_region() must not be called before construct_region() has
completed, since it calls device_del() and put_device().

This isn't introduced by this patch, but I can fix it in v4.

An enable/disable flag to allow sysfs write might fix this, like:

static struct cxl_region *construct_region(...)
{
... // when a construction is done
set_bit(CXL_REGION_F_ENABLE_SYSFS);
return cxlr;
}

This prevents a sysfs write before it's done:

static ssize_t delete_region_store(...)
{
if (!test_bit(CXL_REGION_F_ENABLE_SYSFS)) {
return -EBUSY;
}
...
}

Would there be a better solution? I'd like to ask for comments on this.
This patch series already introduces two new flags, and I'm concerned
about adding another.

Thank you!
Sungwoo.

>
> DJ
>
>
> >
> > [1] https://sashiko.dev/#/patchset/20260422045637.3048249-2-iam%40sung-woo.kim
> >
> > Earlier approach:
> > v2: https://lore.kernel.org/linux-cxl/20260422045637.3048249-2-iam@xxxxxxxxxxxx/
> > v1: https://lore.kernel.org/linux-cxl/20260308185958.2453707-2-iam@xxxxxxxxxxxx/
> >
> > Sungwoo Kim (2):
> > cxl/region: serialize devm action removal via scheduled work
> > cxl/region: Fix a race bug in delete_region_store()
> >
> > drivers/cxl/core/port.c | 6 +++++
> > drivers/cxl/core/region.c | 47 ++++++++++++++++++++++++++++++++++-----
> > drivers/cxl/cxl.h | 9 ++++++++
> > 3 files changed, 57 insertions(+), 5 deletions(-)
> >
>
>