Re: [PATCH] mm: page_isolation: Avoid hugepage scan step underflow

From: David Hildenbrand (Arm)

Date: Tue Jun 02 2026 - 05:40:58 EST

On 6/2/26 09:08, Kaitao Cheng wrote:
> On 2026/6/2 00:21, David Hildenbrand (Arm) wrote:
>> On 5/22/26 11:35, Kaitao Cheng wrote:
>>> On 2026/5/20 01:54, Andrew Morton wrote:
>>>
>>> The direct user-visible effect is not memory corruption, but degraded forward
>>> progress in the page isolation / contiguous allocation path.
>>>
>>> If the race makes folio_nr_pages() return 1 while the current page is still
>>> treated as a tail page of the old HugeTLB folio, the computed step can
>>> underflow:
>>>
>>> step = folio_nr_pages(folio) - folio_page_idx(folio, page);
>>>
>>> The caller then does:
>>>
>>> start_pfn += step;
>>>
>>> With unsigned arithmetic this can wrap and move start_pfn backwards, typically
>>> near the beginning of the old hugepage range rather than advancing past it. In
>>> many cases this only means rescanning part of the same hugepage range, so the
>>> effect may be limited to extra scanning work.
>>>
>>> However, it still violates the scanner's forward-progress assumption: step
>>> is expected to advance start_pfn. If the same transient state is observed
>>> repeatedly, the scanner can keep revisiting the same PFNs, causing excessive
>>> latency and, in the worst case, an apparent stall in operations that rely on
>>> page isolation or contiguous allocation.
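
To make the wraparound described above concrete, here is a minimal user-space
sketch, not kernel code. The helper names scan_step() and scan_step_clamped()
and the PFN values are hypothetical, and the clamp is only one possible guard
under these assumptions, not what the posted patch does.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-in for the quoted computation:
 *   step = folio_nr_pages(folio) - folio_page_idx(folio, page);
 * With unsigned operands the result wraps when page_idx > nr_pages.
 */
static unsigned long scan_step(unsigned long nr_pages, unsigned long page_idx)
{
	return nr_pages - page_idx;
}

/*
 * One possible guard (an assumption for illustration only): if the
 * racy inputs are inconsistent, fall back to advancing by one page
 * so the scanner always makes forward progress.
 */
static unsigned long scan_step_clamped(unsigned long nr_pages,
				       unsigned long page_idx)
{
	if (page_idx >= nr_pages)
		return 1;
	return nr_pages - page_idx;
}

/* Walk through the raced case: nr_pages reads 1, stale tail index is 5. */
static void scan_step_demo(void)
{
	unsigned long head_pfn = 0x100000UL;	/* made-up hugepage start */
	unsigned long start_pfn = head_pfn + 5;	/* currently at tail page 5 */

	/* 1 - 5 wraps to ULONG_MAX - 3 in unsigned arithmetic. */
	unsigned long step = scan_step(1, 5);
	assert(step == ULONG_MAX - 3);

	/* Adding the wrapped step moves start_pfn backwards by 4 pages. */
	assert(start_pfn + step == head_pfn + 1);

	/* The clamped variant advances by exactly one page instead. */
	assert(start_pfn + scan_step_clamped(1, 5) == head_pfn + 6);
}
```

In the non-raced case (for example nr_pages = 512, index 5) both helpers return
507, so the guard only changes behaviour when the two values disagree.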
>>>
>>>
>>> Here is another point raised by AI: the old code also used folio_test_lru()
>>> on a folio pointer obtained without holding a reference. If the folio is
>>> freed and the old head page is reused or observed as a tail page of another
>>> compound page, folio_test_lru() can reach const_folio_flags(), which asserts
>>> that the passed folio is not a tail page. On DEBUG_VM kernels, that can
>>> trigger a VM_BUG_ON_PGFLAGS() crash.
>>>
>>>
>>> Following David Hildenbrand's suggestion, I made some changes as shown below.
>>> I'm not sure whether there are still any other issues.
>>>
>>> --- a/mm/page_isolation.c
>>> +++ b/mm/page_isolation.c
>>> @@ -41,8 +41,14 @@ bool page_is_unmovable(struct zone *zone, struct page *page,
>>> * We need not scan over tail pages because we don't
>>> * handle each tail page individually in migration.
>>> */
>>> - if (PageHuge(page) || PageCompound(page)) {
>>> + if (PageCompound(page)) {
>>> struct folio *folio = page_folio(page);
>>> + unsigned long nr_pages, pfn;
>>> + unsigned int order;
>>> +
>>> + order = compound_order(&folio->page);
>>> + if (order > MAX_FOLIO_ORDER)
>>> + return true;
>>
>> Would we also have to care about non-power-of-2 sizes?
>
> There should not be any non-power-of-2 compound pages here. Neither the
> buddy allocator's allocation model nor the folio_size() / size_to_hstate()
> implementation supports non-power-of-2 compound pages.

Ah, sorry, in scan_movable_pages() we use folio_nr_pages(), and on races that
can give you a non-power-of-2 value.

So yeah, clearly not required here.

>
>>>
>>> if (folio_test_hugetlb(folio)) {
>>> struct hstate *h;
>>> @@ -54,15 +60,16 @@ bool page_is_unmovable(struct zone *zone, struct page *page,
>>> * The huge page may be freed so can not
>>> * use folio_hstate() directly.
>>> */
>>> - h = size_to_hstate(folio_size(folio));
>>> - if (h && !hugepage_migration_supported(h))
>>> + h = size_to_hstate(PAGE_SIZE << order);
>>> + if (!h || !hugepage_migration_supported(h))
>>> return true;
>>> -
>>> - } else if (!folio_test_lru(folio)) {
>>> + } else if (!PageLRU(page)) {
>>
>> Hm, is that required because we could VM_BUG_ON?
>
> Yes, that is one of the reasons for the change.
>
> page_is_unmovable() does not hold a reference on the folio, and the folio
> can be freed and reused concurrently while this scanner is running. In that
> case, the previously obtained folio pointer may no longer refer to a valid
> folio head. If the original head page is reused as a tail page of another
> compound page, folio_test_lru(folio) can hit the folio flag checks and
> potentially trigger a VM_BUG_ON_PGFLAGS().

Okay, spell that out in the patch description.

--
Cheers,

David