Re: Re: [PATCH] dcache: add fs.dentry-limit sysctl with negative-first reaper

From: Horst Birthelmer

Date: Mon May 18 2026 - 03:02:11 EST


On Mon, May 18, 2026 at 09:55:05AM +1000, NeilBrown wrote:
> On Fri, 15 May 2026, Horst Birthelmer wrote:
> > From: Horst Birthelmer <hbirthelmer@xxxxxxx>
> >
> > The dcache only shrinks under memory pressure, which is rarely reached
> > on machines with ample RAM, so cached negative dentries can accumulate
> > without bound. Give administrators a soft cap they can set,
> > and a background worker that prefers negative dentries when reclaiming.
> >
> > Two new sysctls under /proc/sys/fs/:
> >
> > dentry-limit -- soft cap on nr_dentry. 0 (default)
> > disables the feature; behaviour is then
> > identical to before.
>
> Is a system-wide cap really a suitable tool? What guidance would you
> give to sysadmins who are considering setting a number?

I know it is a rhetorical question, but nevertheless:
It's a soft cap, so a sensible value depends on the number of open files
usually floating around on the machine. It even depends on the file systems
in use. That was actually my motivation (more than the negative entries):
some cache entries are expensive for our fuse server due to our DLM usage
and private data held in user space.

> Is there a better approach?

After reading your thoughts and those of the others who have taken the time
to revisit this, I think there is no better solution in the VFS layer.

Since 2025 (commit 395b95530343e) shrink_dentry_list() is an exported symbol,
so a specific file system can use it to do its own housekeeping.
Some will probably consider this a misuse, but it would be more specific
and easier to control, especially for filesystems where certain cache
entries are more expensive than others and/or which run in user space (FUSE).

>
> According to the email you linked, a problem arises when a directory has
> a great many negative children. Code which walks the list of children
> (such as fsnotify) while holding a lock can suffer unpredictable delays
> and result in long lock-hold times. So maybe a limit on negative
> dentries for any parent is what we really want. That would be clumsy to
> implement I imagine.
>
> But what if we move dentries to the end of the list when they become
> negative, and to the start of the list when they become positive? Then
> code which walks the child list could simply abort on the first
> negative.
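
To make the ordering invariant concrete, here is a rough userspace toy of
that idea. It is only a sketch, not kernel code: struct toy_dentry,
struct toy_dir and all toy_*() helpers are made-up names, there is no
locking, and the real child list handling in fs/dcache.c is more involved.
Positive entries are queued at the front, negative ones at the back, and
the walker stops at the first negative it sees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry; NOT the kernel's struct dentry. */
struct toy_dentry {
	struct toy_dentry *prev, *next;
	bool negative;
};

/* Circular list with a sentinel, in the spirit of struct list_head. */
struct toy_dir {
	struct toy_dentry head;
};

static void toy_dir_init(struct toy_dir *dir)
{
	dir->head.prev = dir->head.next = &dir->head;
	dir->head.negative = false;
}

static void toy_unlink(struct toy_dentry *d)
{
	d->prev->next = d->next;
	d->next->prev = d->prev;
}

static void toy_insert_after(struct toy_dentry *pos, struct toy_dentry *d)
{
	d->prev = pos;
	d->next = pos->next;
	pos->next->prev = d;
	pos->next = d;
}

/* Positives go to the front of the list, negatives to the back. */
static void toy_add_child(struct toy_dir *dir, struct toy_dentry *d,
			  bool negative)
{
	d->negative = negative;
	toy_insert_after(negative ? dir->head.prev : &dir->head, d);
}

/* On a state change, requeue the entry so the ordering still holds. */
static void toy_set_negative(struct toy_dir *dir, struct toy_dentry *d,
			     bool negative)
{
	toy_unlink(d);
	toy_add_child(dir, d, negative);
}

/* Visit positive children only: abort on the first negative. */
static size_t toy_walk_positive(const struct toy_dir *dir)
{
	const struct toy_dentry *d;
	size_t visited = 0;

	for (d = dir->head.next; d != &dir->head; d = d->next) {
		if (d->negative)
			break;
		visited++;
	}
	return visited;
}

/* Debug helper: no positive entry may follow a negative one. */
static void toy_check_order(const struct toy_dir *dir)
{
	const struct toy_dentry *d;
	bool seen_negative = false;

	for (d = dir->head.next; d != &dir->head; d = d->next) {
		assert(!(seen_negative && !d->negative));
		seen_negative = seen_negative || d->negative;
	}
}
```

The cost of the walk then scales with the number of positive children
only. The price is that every positive/negative transition has to touch
the parent's list, which is probably where the "not quite as easy as it
sounds" part comes in.
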
>
> I doubt that would be quite as easy as it sounds, but it would at least
> be more focused on the observed symptom rather than some whole-system
> number which only vaguely correlates with the observed symptom.
>
> Maybe a completely different approach: change children-walking code to
> drop and retake the lock (with appropriate validation) periodically.
> That too would address the specific symptom.
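
A minimal userspace sketch of that pattern, under loud assumptions: a
pthread mutex stands in for the parent's lock, a generation counter
stands in for "appropriate validation", and struct toy_parent,
struct toy_child and the toy_*() functions are invented for this mail.
The walker gives the lock up after every `batch` entries and restarts
from the head if the list changed while it was unlocked.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Toy child entry; NOT a kernel structure. */
struct toy_child {
	struct toy_child *next;
};

struct toy_parent {
	pthread_mutex_t lock;		/* protects first and gen */
	struct toy_child *first;
	unsigned long gen;		/* bumped on every list change */
};

static void toy_add(struct toy_parent *p, struct toy_child *c)
{
	pthread_mutex_lock(&p->lock);
	c->next = p->first;
	p->first = c;
	p->gen++;
	pthread_mutex_unlock(&p->lock);
}

/*
 * Walk all children, dropping the lock after every 'batch' entries so
 * that lock hold time is bounded by the batch size and not by the
 * number of children. Returns the number of entries seen in the final
 * pass and stores the number of lock drops in *drops.
 */
static size_t toy_walk_batched(struct toy_parent *p, size_t batch,
			       size_t *drops)
{
	struct toy_child *c;
	unsigned long gen;
	size_t visited, in_batch;

	assert(batch > 0);
	*drops = 0;
	pthread_mutex_lock(&p->lock);
restart:
	visited = 0;
	in_batch = 0;
	gen = p->gen;
	for (c = p->first; c; c = c->next) {
		visited++;
		if (++in_batch < batch)
			continue;
		in_batch = 0;
		/* Give waiters a chance to take the lock. */
		pthread_mutex_unlock(&p->lock);
		(*drops)++;
		pthread_mutex_lock(&p->lock);
		/* Validation: our cursor is only trusted if nothing changed. */
		if (p->gen != gen)
			goto restart;
	}
	pthread_mutex_unlock(&p->lock);
	return visited;
}
```

Restarting from the head is the crudest possible validation and can keep
a walker busy for a long time on a directory that changes constantly. A
real implementation would more likely pin a cursor entry and resume from
there.
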
>
> Thanks for attempting to resolve this issue, but I'm not convinced that
> you have found a good solution yet.

Thanks for the clear words. I really appreciate it!

>
> NeilBrown
>