Re: [PATCH v2] fanotify: report thread pidfds for FAN_REPORT_TID
From: Jan Kara
Date: Mon Jun 01 2026 - 05:18:13 EST
On Fri 29-05-26 09:21:55, Amir Goldstein wrote:
> On Fri, May 29, 2026 at 4:01 AM AnonymeMeow <anonymemeow@xxxxxxxxx> wrote:
> >
> > The FAN_REPORT_PIDFD and FAN_REPORT_TID flags used to be mutually
> > exclusive because by the time the pidfd support was introduced to
> > fanotify, pidfds could only be created for thread group leaders. Now
> > that the pidfd API supports thread-specific pidfds via PIDFD_THREAD,
> > this restriction can be lifted.
> >
> > Also drop the pid_has_task() check to allow pidfds to be reported for
> > reaped tasks as well.
>
> I am not objecting to this change, but as a rule of thumb, almost
> every time that
> a commit message says "Also" it means that it should have been a separate
> commit.
>
> This is not an exception to the rule and even worse -
> This change PIDFD_STALE is an independent change of behavior
> (as observed by breakage of fanotify21) that could (maybe unlikely)
> surprise real users.
>
> We have few options:
> 1. Make it opt-in with yet another fanotify_init flag
> 2. Enable it implicitly together (and in the same patch) with FAN_REPORT_TID
> 3. Enable it without opt-in (as you did) but as separate patch
> that will be reverted if someone complains on "real world"
> (i.e. not test) breakage
>
> IMO, the path of least resistance is option 2.
> If users want to get the pidfd of tgid I suppose there should be no
> problem to get it from the pidfd of the thread. Right?
> So what FAN_REPORT_TID really means is give me a richer pidfd
> with more information about the actor than without FAN_REPORT_TID.
>
> I'd like to get Jan's input on those UAPI choices.
In this case I agree with Christian. I don't think there's a reason to
complicate the API so I'd go for 3 - I totally agree we want this in a
separate commit to have the change of behavior clearly documented and easy
to revert. I just think the pidfd could always be stale by the time you get
to use it so I think chances for unexpected userspace confusion are
limited.
Honza
> > Link: https://lore.kernel.org/lkml/20260528-schmuckvoll-heilen-garen-be77b4208671@brauner/
> > Signed-off-by: AnonymeMeow <anonymemeow@xxxxxxxxx>
> > ---
> >
> > On 2026-05-28 13:51 +0200, Christian Brauner wrote:
> > > For quite a while the kernel refused to hand out pidfds for reaped
> > > processes even if the struct pid was pinned like in this case.
> > >
> > > But that makes various APIs - including this one - way less powerful
> > > than they can be. Nowadays the socket layer already hands out pidfds for
> > > reaped processes. It also stashed the struct pid. Let's do the same
> > > here.
> > >
> > > Drop the pid_has_task() change and then:
> > >
> > > pidfd = pidfd_prepare(event->pid, pidfd_flags | PIDFD_STALE, &pidfd_file);
> > >
> > > which instructs pidfs to and out a pidfd even if the task has already
> > > been reaped. Reaped pidfds can still be queried for various types of
> > > information that is kept around even if the task is long gone.
> >
> > Thanks for the review. I've updated this in v2 as you suggested.
> >
> > Also, the LTP fanotify21 test currently explicitly expects ESRCH or
> > FAN_NOPIDFD for exited processes. I will update that test case accordingly.
>
> This sentence highlights the problematic aspect of this change.
> How would you update the test expectation?
> The test must be able to run on upstream as well as stable kernels.
> Would you test for support in FAN_REPORT_TID
> and according to the result determine the expected behavior of the
> test cases (variants) without FAN_REPORT_TID?
> Very ugly IMO.
>
> OTH, if you take the suggestion that PIDFD_STALE is implied only
> for FAN_REPORT_TID, you can change the expectation for the
> "terminated child" test case only for TST_VARIANT_PIDFD_THREAD.
>
> Thanks,
> Amir.
>
> >
> > With Best Regards,
> > AnonymeMeow
> >
> > ---
> > fs/notify/fanotify/fanotify_user.c | 33 +++++++-----------------------
> > 1 file changed, 7 insertions(+), 26 deletions(-)
> >
> > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> > index ae904451dfc0..b604e3da58ad 100644
> > --- a/fs/notify/fanotify/fanotify_user.c
> > +++ b/fs/notify/fanotify/fanotify_user.c
> > @@ -19,6 +19,7 @@
> > #include <linux/memcontrol.h>
> > #include <linux/statfs.h>
> > #include <linux/exportfs.h>
> > +#include <linux/pidfd.h>
> >
> > #include <asm/ioctls.h>
> >
> > @@ -903,25 +904,13 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> > metadata.fd = fd >= 0 ? fd : FAN_NOFD;
> >
> > if (pidfd_mode) {
> > - /*
> > - * Complain if the FAN_REPORT_PIDFD and FAN_REPORT_TID mutual
> > - * exclusion is ever lifted. At the time of incoporating pidfd
> > - * support within fanotify, the pidfd API only supported the
> > - * creation of pidfds for thread-group leaders.
> > - */
> > - WARN_ON_ONCE(FAN_GROUP_FLAG(group, FAN_REPORT_TID));
> > + unsigned int pidfd_flags = PIDFD_STALE;
> >
> > - /*
> > - * The PIDTYPE_TGID check for an event->pid is performed
> > - * preemptively in an attempt to catch out cases where the event
> > - * listener reads events after the event generating process has
> > - * already terminated. Depending on flag FAN_REPORT_FD_ERROR,
> > - * report either -ESRCH or FAN_NOPIDFD to the event listener in
> > - * those cases with all other pidfd creation errors reported as
> > - * the error code itself or as FAN_EPIDFD.
> > - */
> > - if (metadata.pid && pid_has_task(event->pid, PIDTYPE_TGID))
> > - pidfd = pidfd_prepare(event->pid, 0, &pidfd_file);
> > + if (FAN_GROUP_FLAG(group, FAN_REPORT_TID))
> > + pidfd_flags |= PIDFD_THREAD;
> > +
> > + if (metadata.pid)
> > + pidfd = pidfd_prepare(event->pid, pidfd_flags, &pidfd_file);
> >
> > if (!FAN_GROUP_FLAG(group, FAN_REPORT_FD_ERROR) && pidfd < 0)
> > pidfd = pidfd == -ESRCH ? FAN_NOPIDFD : FAN_EPIDFD;
> > @@ -1628,14 +1617,6 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
> > #endif
> > return -EINVAL;
> >
> > - /*
> > - * A pidfd can only be returned for a thread-group leader; thus
> > - * FAN_REPORT_PIDFD and FAN_REPORT_TID need to remain mutually
> > - * exclusive.
> > - */
> > - if ((flags & FAN_REPORT_PIDFD) && (flags & FAN_REPORT_TID))
> > - return -EINVAL;
> > -
> > /* Don't allow mixing mnt events with inode events for now */
> > if (flags & FAN_REPORT_MNT) {
> > if (class != FAN_CLASS_NOTIF)
> > --
> > 2.54.0
> >
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR