Re: [PATCH V3 1/2] mm/khugepaged: fix race with folio split/free using temporary reference

From: David Hildenbrand
Date: Tue May 27 2025 - 04:41:02 EST


On 27.05.25 10:06, Dev Jain wrote:

On 27/05/25 1:18 pm, David Hildenbrand wrote:
On 27.05.25 05:20, Dev Jain wrote:

On 26/05/25 11:58 pm, Shivank Garg wrote:
hpage_collapse_scan_file() calls is_refcount_suitable(), which in turn
calls folio_mapcount(). folio_mapcount() checks folio_test_large()
before
proceeding to folio_large_mapcount(), but there is a race window
where the
folio may get split/freed between these checks, triggering:

    VM_WARN_ON_FOLIO(!folio_test_large(folio), folio)

Take a temporary reference to the folio in hpage_collapse_scan_file().
This stabilizes the folio during refcount check and prevents incorrect
large folio detection due to concurrent split/free. Use helper
folio_expected_ref_count() + 1 to compare with folio_ref_count()
instead of using is_refcount_suitable().

Fixes: 05c5323b2a34 ("mm: track mapcount of large folios in single
value")
Reported-by: syzbot+2b99589e33edbe9475ca@xxxxxxxxxxxxxxxxxxxxxxxxx
Closes:
https://lore.kernel.org/all/6828470d.a70a0220.38f255.000c.GAE@xxxxxxxxxx

Suggested-by: David Hildenbrand <david@xxxxxxxxxx>
Acked-by: David Hildenbrand <david@xxxxxxxxxx>
Signed-off-by: Shivank Garg <shivankg@xxxxxxx>
---

The patch looks fine.

I was just wondering about the implications of this on migration.
Earlier
we had a refcount race between migration and shmem page fault via
filemap_get_entry()
taking a reference and not releasing it till we take the folio lock,
which was held
by the migration path. I would like to *think* that real workloads,
when migrating
pages, will *not* be faulting on those pages simultaneously, just
guessing. But now
we have a kernel thread (khugepaged) racing against migration. I may
just be over-speculating.

I'm not quite sure I understand the concern you have. Any temporary
reference can temporarily block migration, however, the retry logic
should be able to handle that just fine -- and this code is not really
special (see filemap_get_entry()).


You are correct that any temp ref can block migration, however, that
reference has to come after the folios have been isolated in the
migration path.

So the probability of someone taking a reference on the folio is quite
low since it has been isolated. The problem with filemap_get_entry() is
that it finds

the folio in the pagecache, so isolation is useless, then takes a
reference and then shmem_get_folio_gfp() does a folio_lock() instead of
folio_try_lock().

This was the race which I talked about an year back at [1]. My concern
is that we are adding another candidate to that race; just wondering if
there is

a better solution to fix the race mentioned in Shivank's patchset.

Note that in this code here, we're not locking the folio just yet, we're only grabbing a reference for a very short time.

We will isolate+lock the folios in collapse_file(), where we also lock all slots in the pagecache.

The alternative would be to also lock all slots here, which is arguably worse.

--
Cheers,

David / dhildenb