Re: [PATCH V3 7/9] dax: fix holder_ops race in fs_put_dax()

From: John Groves

Date: Sun Jun 07 2026 - 11:38:49 EST


On 26/06/01 05:03PM, Dave Jiang wrote:
>
>
> On 5/30/26 9:51 AM, John Groves wrote:
> > From: John Groves <John@xxxxxxxxxx>
> >
> > Clear holder_ops before holder_data so that a concurrent fs_dax_get()
> > cannot have its newly installed holder_ops overwritten. cmpxchg()
> > provides release ordering on weakly-ordered architectures, ensuring the
> > WRITE_ONCE(holder_ops, NULL) store is visible to any CPU that observes
> > the holder_data release.
> >
> > Add WARN_ON() on the cmpxchg result to catch two API contract
> > violations: fs_put_dax() called by a non-holder, or called twice by
> > the same holder (double-put). Either way holder_ops has already been
> > cleared, so WARN_ON() does not prevent the damage but makes the bug
> > visible. (Note: "damage" is only if a non-holder causes holder_ops
> > to be cleared)
> >
> > Also add a kerneldoc comment documenting that fs_put_dax() must only
> > be called by the current holder.
> >
> > Fixes: eec38f5d86d27 ("dax: Add fs_dax_get() func to prepare dax for fs-dax usage")
> > Signed-off-by: John Groves <john@xxxxxxxxxx>
> > ---
> > drivers/dax/super.c | 35 ++++++++++++++++++++++++++++++++---
> > 1 file changed, 32 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> > index 25cf99dd9360b..4c56ac2faacdb 100644
> > --- a/drivers/dax/super.c
> > +++ b/drivers/dax/super.c
> > @@ -116,11 +116,40 @@ EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
> >
> > #if IS_ENABLED(CONFIG_FS_DAX)
> >
> > +/**
> > + * fs_put_dax() - release holder ownership of a dax_device
> > + * @dax_dev: dax device to release (may be NULL)
> > + * @holder: the holder pointer previously passed to fs_dax_get() or
> > + * fs_dax_get_by_bdev(); must match exactly, as it is used
> > + * in a cmpxchg to atomically release ownership
> > + *
> > + * Must only be called by the current holder. Clears holder_ops before
> > + * holder_data to avoid a race where a concurrent fs_dax_get() could have
> > + * its newly installed holder_ops overwritten.
> > + */
> > void fs_put_dax(struct dax_device *dax_dev, void *holder)
> > {
> > - if (dax_dev && holder &&
> > - cmpxchg(&dax_dev->holder_data, holder, NULL) == holder)
> > - dax_dev->holder_ops = NULL;
> > + if (dax_dev && holder) {
> > + /*
> > + * Clear holder_ops before releasing holder_data. A concurrent
> > + * dax_holder_notify_failure() that sees NULL ops returns
> > + * -EOPNOTSUPP cleanly. A concurrent fs_dax_get() that acquires
> > + * holder_data after the cmpxchg below is guaranteed to observe
> > + * holder_ops=NULL first (cmpxchg provides release ordering), so
> > + * its subsequent store of new ops will not be overwritten.
> > + *
> > + * Two cases will trigger the WARN_ON():
> > + * - Caller is not the current holder; this is an API contract
> > + * violation, and the holder will no longer get callbacks
> > + * - Holder calls this function twice; also a contract violation
> > + *
> > + * A lock would be necessary to guard against the contract
> > + * violations, but we WARN_ON() instead since violating the
> > + * contract is a bug
> > + */
> > + WRITE_ONCE(dax_dev->holder_ops, NULL);
> > + WARN_ON(cmpxchg(&dax_dev->holder_data, holder, NULL) != holder);
> > + }
> > put_dax(dax_dev);
> > }
> > EXPORT_SYMBOL_GPL(fs_put_dax);
>
>
> This is what Claude Opus 4.8 said:
>
> The added WARN_ON(cmpxchg(...) != holder) fires on the supported
> device-removal-while-mounted path. kill_dax() (super.c:457) clears holder_data
> = NULL while a holder is still attached — it explicitly tests holder_data !=
> NULL to deliver MF_MEM_PRE_REMOVE first. For xfs on pmem:
>
> 1. pmem_remove() → kill_dax() → MF_MEM_PRE_REMOVE →
> xfs_force_shutdown(SHUTDOWN_FORCE_UMOUNT); the handler does not call
> fs_put_dax. kill_dax then clears holder_data.
> 2. Forced unmount → xfs_free_buftarg() → fs_put_dax(bt_daxdev, mp).
> 3. cmpxchg(&holder_data, mp, NULL) returns NULL (already cleared) != mp → WARN
> fires, despite xfs being the legitimate holder doing a single put.
>
> The old == holder form skipped silently in this case. On panic_on_warn systems
> this turns a supported device removal into a panic.
>
> The commit message's claim that the WARN catches only "non-holder" or
> "double-put" contract violations is incomplete — it also catches the holder
> racing with kill_dax(), which is not a contract violation.
>
> This is the suggested fix:
> void fs_put_dax(struct dax_device *dax_dev, void *holder)
> {
> if (dax_dev && holder) {
> void *prev;
>
> /*
> * Clear holder_ops before releasing holder_data so a
> * concurrent fs_dax_get() that wins holder_data observes
> * holder_ops == NULL and its store is not overwritten.
> */
> WRITE_ONCE(dax_dev->holder_ops, NULL);
> prev = cmpxchg(&dax_dev->holder_data, holder, NULL);
>
> /*
> * prev == holder: normal release.
> * prev == NULL: already released by kill_dax() when the
> * device was removed under a live holder;
> * not a bug.
> * prev != holder (non-NULL): fs_put_dax() called by something
> * that is not the current holder.
> */
> WARN_ON(prev && prev != holder);
> }
> put_dax(dax_dev);
> }
>
>

Looks good - going with this approach.

Thanks!
John