Re: [PATCH] vfs: document locking for mnt_notify_add()

From: Jann Horn

Date: Mon Jun 01 2026 - 10:34:36 EST


On Mon, Jun 1, 2026 at 3:04 PM Jan Kara <jack@xxxxxxx> wrote:
> On Fri 29-05-26 21:38:08, Jann Horn wrote:
> > The locking in mnt_notify_add(), which was introduced in commit
> > bf630c401641 ("vfs: add notifications for mount attach and detach"), is a
> > bit gnarly.
> > notify_list is protected by namespace_lock, but there are cases where
> > mnt_notify_add() is called without holding namespace_lock, for example:
> >
> > __do_sys_fsmount -> mnt_add_to_ns -> mnt_notify_add
> >
> > Luckily, in cases where the namespace_lock isn't held, the namespace is
> > always freshly created and can't have any fsnotify marks yet, which means
> > the notify_list isn't actually accessed.
> >
> > The existing comment claims that not accessing the notify_list in these
> > cases is merely an optimization, which is wrong. Fix the comment, and add a
> > locking assertion.
> >
> > To allow mnt_notify_add() to reference the namespace_sem, move it into
> > fs/namespace.c.
> >
> > Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
>
> Thanks for looking into this. Adding Miklos to CC who actually wrote that
> code. Also in cases like path_pivot_root() I don't think we hold the
> namespace_sem so the locking is even more complicated if it is correct at
> all. Miklos?

I was confused by that code too, but it turns out that actually does
hold the namespace_sem - it's just really hard to see because of
__cleanup() magic.

path_pivot_root() does LOCK_MOUNT(), which is a macro that expands to
LOCK_MOUNT_MAYBE_BENEATH(), which expands to something that calls
do_lock_mount() and does a `__cleanup(unlock_mount)` declaration.
do_lock_mount() does namespace_lock(), and on return from
path_pivot_root(), the cleanup handler unlock_mount() calls
__unlock_mount(), which calls namespace_unlock().