Re: [patch] mm, page_alloc: reintroduce page allocation stall warning

From: Vlastimil Babka (SUSE)

Date: Mon Mar 30 2026 - 11:29:29 EST


On 3/30/26 15:54, Michal Hocko wrote:
> On Sun 29-03-26 18:08:52, David Rientjes wrote:
>> Previously, we had warnings when a single page allocation took longer
>> than reasonably expected. This was introduced in commit 63f53dea0c98
>> ("mm: warn about allocations which stall for too long").
>>
>> The warning was subsequently reverted in commit 400e22499dd9 ("mm: don't
>> warn about allocations which stall for too long") but for reasons
>> unrelated to the warning itself.
>
> I think it makes sense to summarize the reasons for the revert. I would
> propose to change the above to something like
> "
> The warning was subsequently reverted in commit 400e22499dd9 ("mm: don't
> warn about allocations which stall for too long") because it was
> possible to generate memory pressure that would effectively stall
> further progress through printk execution.
> "
>
>> @@ -4841,6 +4884,9 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>>  	if (current->flags & PF_MEMALLOC)
>>  		goto nopage;
>>
>> +	/* If allocation has taken excessively long, warn about it */
>> +	check_alloc_stall_warn(gfp_mask, ac->nodemask, order, alloc_start_time);
>> +
>>  	/* Try direct reclaim and then allocating */
>>  	if (!compact_first) {
>>  		page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags,
>
> Is there any specific reason for this placement? Compaction can take
> quite some time as well.

It seems fine to me - as long as the slowpath has been retrying for 10 seconds
and still can't obtain a page, there's a warning.

What we don't catch is the case where a single get_page_from_freelist(),
direct compaction or direct reclaim attempt is what pushes us over 10 seconds
and that same attempt succeeds. If that's a concern, we should add another
check_alloc_stall_warn() call under the got_pg label (as the RFC had) - I'm
not sure a single call site can cover all of it.
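
To illustrate the gap, here is a rough userspace model, not kernel code. The
names struct attempt, stall_exceeded() and model_slowpath_warns() are made up
for this sketch, and so is the ">= 10 seconds" comparison; the real
check_alloc_stall_warn() may well differ (e.g. rate limiting). It only models
where the check runs relative to each attempt:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STALL_TIMEOUT_MS 10000UL	/* the 10 seconds discussed above */

/* One modelled slowpath attempt: how long it took, whether it got a page. */
struct attempt {
	unsigned long duration_ms;
	bool success;
};

/* Hypothetical stand-in for the stall check: true once the timeout is hit. */
static bool stall_exceeded(unsigned long start_ms, unsigned long now_ms)
{
	return now_ms - start_ms >= STALL_TIMEOUT_MS;
}

/*
 * Walk the attempts the way a retry loop would. The check runs before each
 * attempt (modelling the placement in the patch) and, if check_at_got_pg is
 * set, once more after a successful attempt (modelling a call under got_pg).
 * Returns true if a warning would have been triggered at any point.
 */
static bool model_slowpath_warns(const struct attempt *steps, size_t nr_steps,
				 bool check_at_got_pg)
{
	unsigned long start_ms = 0, now_ms = 0;
	size_t i;

	assert(steps != NULL && nr_steps > 0);

	for (i = 0; i < nr_steps; i++) {
		/* Check at the top of the retry, before reclaim/compaction. */
		if (stall_exceeded(start_ms, now_ms))
			return true;

		/* The attempt itself consumes time. */
		now_ms += steps[i].duration_ms;

		if (steps[i].success) {
			/* Optional second check on the success path. */
			return check_at_got_pg &&
			       stall_exceeded(start_ms, now_ms);
		}
	}
	return false;
}
```

With a single 12 second attempt that succeeds, the model warns only when
check_at_got_pg is set. With attempts of 6s (fail), 6s (fail) and then a quick
success, the check at the top of the third retry fires regardless.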