Re: [PATCH v3] mm: process_mrelease: introduce PROCESS_MRELEASE_REAP_KILL flag

In reply to: Minchan Kim: "Re: [PATCH v3] mm: process_mrelease: introduce PROCESS_MRELEASE_REAP_KILL flag"

From: Linus Torvalds

Date: Fri May 22 2026 - 18:42:24 EST

On Tue, 19 May 2026 at 13:53, Minchan Kim <minchan@xxxxxxxxxx> wrote:
>
> First, the -ESRCH race remains unsolved.

Do we really care?

> Without Jann's patch to preserve the mm pointer via task->exit_mm, the
> userspace killer won't even have a chance to attempt reaping.

.. but without the mm_users grab, the memory gets torn down
regardless. The mm_struct remains, but none of the page tables would.

So no, it won't attempt reaping and will get -ESRCH, but the process
will be gone, so what's the difference?

> Second, the latency bottleneck transfers from mmput() to mmap_lock.
> If a low-priority procfs reader is preempted or stalled while holding the
> mmap_read_lock, the exiting process calling exit_mmap() will block indefinitely
> when trying to acquire the mmap_write_lock.

Yes. However, that really does sound like at this point it's no worse
than the PIDFD_SIGNAL_PROCESS_GROUP_EXPEDITE you suggest. That needs
the mmap lock too.

One thing that makes /proc/pid/smaps worse is that m_start() will take
the lock even completely pointlessly, over and over again.

Even if we just remove the mmget_not_zero() - which I think we should
do, and just rely on the mm_count of the open/close - it *keeps* doing
that silly lock_vma_range() using lock_ctx->mm - even if the task mm
has long since gone away.

So I think that the code should - in addition to not taking the
mm_users count - then also do

    if (!task->mm)
        return -ESRCH;

in m_start(), which should simply stop doing any pointless work on a
VM that no longer exists.
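
As a rough illustration of that early exit, here is a userspace mock, not
kernel code. Every mock_* name is hypothetical, and the ERR_PTR-style helpers
are simplified stand-ins for the kernel ones. Since a seq_file start callback
returns a pointer, the sketch assumes the error would be wrapped rather than
returned as a bare integer. The counter only stands in for taking the lock.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's ERR_PTR helpers (assumption). */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-4095;
}

/* Hypothetical mock types; they only model the fields discussed here. */
struct mock_mm { int nr_vmas; };
struct mock_task { struct mock_mm *mm; };	/* NULL once the task dropped its mm */

struct mock_seq_priv {
	struct mock_task *task;
	struct mock_mm *mm;	/* kept alive from open to close, mm_count style */
	int lock_taken;		/* counts how often the "lock" was taken */
};

/* Mock of the start callback: bail out before any locking if the mm is gone. */
static void *mock_m_start(struct mock_seq_priv *priv)
{
	if (!priv->task->mm)
		return ERR_PTR(-ESRCH);

	priv->lock_taken++;	/* stands in for taking the mmap/VMA lock */
	return priv->mm;
}
```

The point of the mock is only that, once task->mm is NULL, repeated calls
return -ESRCH without touching the lock at all.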

At that point it will go through at most one iteration of show_smap()
with the lock held, but that will be true even for your
PIDFD_SIGNAL_PROCESS_GROUP_EXPEDITE case.

Linus