Re: [PATCH] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers

From: Ryusuke Konishi

Date: Wed Apr 29 2026 - 08:35:59 EST


On Wed, Apr 29, 2026 at 10:50 AM Deepanshu Kartikey wrote:
>
> On Wed, Apr 29, 2026 at 12:29 AM Viacheslav Dubeyko wrote:
> >
> > On Tue, 2026-04-28 at 09:32 +0530, Deepanshu Kartikey wrote:
> > > Syzbot reported a hung task in nilfs_transaction_begin() where multiple
> > > tasks performing chmod() on a nilfs2 mount blocked for over 143 seconds
> > > waiting to acquire ns_segctor_sem for read:
> > >
> > > INFO: task syz.0.17:5918 blocked for more than 143 seconds.
> > > Call Trace:
> > > schedule+0x164/0x360
> > > rwsem_down_read_slowpath+0x6d9/0x940
> > > down_read+0x99/0x2e0
> > > nilfs_transaction_begin+0x364/0x710 fs/nilfs2/segment.c:221
> > > nilfs_setattr+0x124/0x2c0 fs/nilfs2/inode.c:921
> > > notify_change+0xc1a/0xf40
> > > chmod_common+0x273/0x4a0
> > > do_fchmodat+0x12d/0x230
> > >
> > > The writer holding ns_segctor_sem was a concurrent NILFS_IOCTL_CLEAN_SEGMENTS
> > > caller, stuck inside printk while emitting per-element warnings from
> > > nilfs_sufile_updatev():
> > >
> > > __nilfs_msg+0x373/0x450 fs/nilfs2/super.c:78
> > > nilfs_sufile_updatev+0x21c/0x6d0 fs/nilfs2/sufile.c:186
> > > nilfs_sufile_freev fs/nilfs2/sufile.h:93 [inline]
> > > nilfs_free_segments fs/nilfs2/segment.c:1140 [inline]
> > > nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1261 [inline]
> > > nilfs_segctor_do_construct+0x1f55/0x76c0
> > > nilfs_clean_segments+0x3bd/0xa50
> > > nilfs_ioctl_clean_segments fs/nilfs2/ioctl.c:922 [inline]
> > > nilfs_ioctl+0x261f/0x2780
> > >
> > > The root cause is that nilfs_ioctl_clean_segments() does not validate
> > > the user-supplied segment numbers in kbufs[4] before calling
> > > nilfs_clean_segments(), which acquires ns_segctor_sem for write. The
> > > range check on each segnum is performed deep inside the call chain by
> > > nilfs_sufile_updatev(), which emits a nilfs_warn() per invalid entry
> > > while still under the segctor lock and the sufile mi_sem. Under load
> > > (repeated invocations across multiple mounts saturating the global
> > > printk path), the cumulative printk latency keeps ns_segctor_sem held
> > > long enough to trip the hung_task watchdog, blocking concurrent
> > > operations such as chmod() that need ns_segctor_sem for read.
> > >
> > > Fix by validating the contents of kbufs[4] in the ioctl entry path,
> > > before any FS-wide lock is acquired. Out-of-range segment numbers are
> > > rejected with -EINVAL synchronously, with no work performed under
> > > ns_segctor_sem.
> > >
> > > Reported-by: syzbot+62f0f99d2f2bb8e3bbd7@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > Closes: https://syzkaller.appspot.com/bug?extid=62f0f99d2f2bb8e3bbd7
> > > Fixes: 4f6b828837b4 ("nilfs2: fix lock order reversal in nilfs_clean_segments ioctl")
> > > Tested-by: syzbot+62f0f99d2f2bb8e3bbd7@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > Signed-off-by: Deepanshu Kartikey <kartikey406@xxxxxxxxx>
> > > ---
> > > fs/nilfs2/ioctl.c | 16 ++++++++++++++++
> > > 1 file changed, 16 insertions(+)
> > >
> > > diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c
> > > index e0a606643e87..38822dce1839 100644
> > > --- a/fs/nilfs2/ioctl.c
> > > +++ b/fs/nilfs2/ioctl.c
> > > @@ -846,6 +846,7 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
> > > struct the_nilfs *nilfs;
> > > size_t len, nsegs;
> > > int n, ret;
> > > + size_t i;
> >
> > What about re-using the n variable? Does it make sense to introduce new one?
> >
> > >
> > > if (!capable(CAP_SYS_ADMIN))
> > > return -EPERM;
> > > @@ -876,6 +877,21 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
> > > }
> > > nilfs = inode->i_sb->s_fs_info;
> > >
> > > + /*
> > > + * Validate segment numbers against the filesystem's segment count
> > > + * before entering nilfs_clean_segments(), which acquires
> > > + * ns_segctor_sem for write. Catching invalid segnums here avoids
> > > + * holding that lock while emitting per-element diagnostics under
> > > + * the segment constructor.
> > > + */
> > > + for (i = 0; i < nsegs; i++) {
> > > + if (((__u64 *)kbufs[4])[i] >= nilfs->ns_nsegments) {
> > > + ret = -EINVAL;
> > > + kfree(kbufs[4]);
> > > + goto out;
> >
> > Are you sure that you need to free buffer here and go to out? Maybe, we can
> > introduce another label and to jump to kfree(kbufs[4]) at the end of method?
> >
> > Thanks,
> > Slava.
> >
> > > + }
> > > + }
> > > +
> > > for (n = 0; n < 4; n++) {
> > > ret = -EINVAL;
> > > if (argv[n].v_size != argsz[n])
> >
>
> Thanks for the feedback. I have sent patch v2.
>
> Thanks
>
> Deepanshu Kartikey

Thank you, Deepanshu, for the patch proposal.

Because nilfs->ns_nsegments can be modified by nilfs_ioctl_resize(),
we must avoid race conditions regarding this proposed fix.

Therefore, it's appropriate to insert this check within the write lock
section of nilfs->ns_segctor_sem, that is, immediately after calling
nilfs_transaction_lock() within nilfs_clean_segments().

As a coding comment, directly referencing kbufs[4] as an array is not
very readable, so it's better to declare a variable in the local
variable declaration section of nilfs_clean_segments() like this:

size_t i, nsegs = argv[4].v_nmembs;
__u64 *segnumv = kbufs[4];

and then compare by referencing segnumv[i].
Here, the original variable name "nsegs" is confusingly similar to
"ns_nsegments", therefore, it would be better to simply change it to
"n" or rename it to something that more concisely represents the
number of segments in the array.

Also, to help identify the error's cause, I recommend adding an error
message like the following within the pre-check loop:

nilfs_err(sb,
"Segment number %llu to be freed is out of range",
(unsigned long long)segnumv[i]);

Finally, a minor comment: to avoid scattering the function's exit path
when making corrections according to the above policy, it's better to
add a label like the following to nilfs_clean_segments() and jump to
it, rather than returning separately.

out_unlock:
...

bail_unlock:
nilfs_transaction_unlock(sb);
return err;

Thanks,
Ryusuke Konishi