[PATCH v2 1/9] nfsd: fix BUG_ON in nfsd4_alloc_layout_stateid on racing delegation revoke
From: Jeff Layton
Date: Sat May 30 2026 - 09:19:48 EST
nfsd4_alloc_layout_stateid reads fp->fi_deleg_file without holding
fi_lock when the parent stateid is a delegation. A concurrent delegation
revoke via the laundromat can clear fi_deleg_file under fi_lock, causing
nfsd_file_get() to return NULL and triggering the BUG_ON.
This race is client-reachable: two NFS clients can trigger it when one
holds a delegation and the other opens the same file, forcing a recall.
If the first client doesn't respond to the recall, the laundromat
revokes the delegation. A concurrent LAYOUTGET from any client using the
delegation stateid then hits the race window.
Fix this by taking rcu_read_lock() around the fi_deleg_file read in
the SC_TYPE_DELEG path, and by replacing the BUG_ON with a graceful error
return that cleans up the partially initialized layout stateid.
Fixes: c5c707f96fc9 ("nfsd: implement pNFS layout recalls")
Assisted-by: kres:claude-opus-4-7
Reported-by: Chris Mason <clm@xxxxxxxx>
Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
---
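Note for reviewers (below the cut, not part of the commit): a minimal
userspace sketch of the pattern the fix relies on, namely "read a slot
that may be cleared concurrently, try to take a reference, and fail
gracefully if that yields NULL". Everything here is hypothetical and for
illustration only: struct file_ref, struct file_slot, struct layout_state,
file_tryget(), file_put(), slot_revoke() and layout_state_alloc() are
stand-ins, not nfsd symbols, and C11 atomics stand in for the RCU-protected
read. Objects are never freed in this model, so it does not need the
lifetime guarantee that rcu_read_lock() provides in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted file object. */
struct file_ref {
	atomic_int refcount;
};

/* Hypothetical stand-in for the slot holding the delegation file pointer. */
struct file_slot {
	_Atomic(struct file_ref *) deleg_file;
};

/* Hypothetical stand-in for the layout state being constructed. */
struct layout_state {
	struct file_ref *file;
};

/* Take a reference unless the pointer is NULL or the count already hit zero. */
static struct file_ref *file_tryget(struct file_ref *f)
{
	int old;

	if (!f)
		return NULL;
	old = atomic_load(&f->refcount);
	while (old > 0) {
		/* On failure, 'old' is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return f;
	}
	return NULL;
}

/* Drop a reference; this model never frees the object. */
static void file_put(struct file_ref *f)
{
	int old = atomic_fetch_sub(&f->refcount, 1);

	assert(old > 0);
}

/* Model of a revoke: clear the slot and hand back the old pointer. */
static struct file_ref *slot_revoke(struct file_slot *slot)
{
	return atomic_exchange(&slot->deleg_file, NULL);
}

/*
 * Allocate a layout state. If the slot was cleared before we could take
 * a reference, undo the allocation and return NULL instead of asserting.
 */
static struct layout_state *layout_state_alloc(struct file_slot *slot)
{
	struct layout_state *ls = calloc(1, sizeof(*ls));

	if (!ls)
		return NULL;
	ls->file = file_tryget(atomic_load(&slot->deleg_file));
	if (!ls->file) {
		free(ls);
		return NULL;
	}
	return ls;
}
```

In this model, a caller that sees NULL from layout_state_alloc() treats it
as an ordinary allocation failure, which mirrors how the patch turns a
lost race into an error return rather than a crash.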
fs/nfsd/nfs4layouts.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index 9ed2e3d65062..6c4e4fdd6c05 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -247,11 +247,17 @@ nfsd4_alloc_layout_stateid(struct nfsd4_compound_state *cstate,
nfsd4_init_cb(&ls->ls_recall, clp, &nfsd4_cb_layout_ops,
NFSPROC4_CLNT_CB_LAYOUT);
- if (parent->sc_type == SC_TYPE_DELEG)
- ls->ls_file = nfsd_file_get(rcu_dereference_protected(fp->fi_deleg_file, 1));
- else
+ if (parent->sc_type == SC_TYPE_DELEG) {
+ rcu_read_lock();
+ ls->ls_file = nfsd_file_get(rcu_dereference(fp->fi_deleg_file));
+ rcu_read_unlock();
+ } else {
ls->ls_file = find_any_file(fp);
- BUG_ON(!ls->ls_file);
+ }
+ if (!ls->ls_file) {
+ nfs4_put_stid(stp);
+ return NULL;
+ }
ls->ls_fenced = false;
ls->ls_fence_delay = 0;
--
2.54.0