Re: [PATCH v2] mm/mempolicy: track page allocations per mempolicy
From: Huang, Ying
Date: Tue Mar 17 2026 - 02:45:16 EST
"JP Kobryn (Meta)" <jp.kobryn@xxxxxxxxx> writes:
> On 3/15/26 7:54 PM, Huang, Ying wrote:
>> "JP Kobryn (Meta)" <jp.kobryn@xxxxxxxxx> writes:
>>
>>> On 3/13/26 12:34 AM, Vlastimil Babka (SUSE) wrote:
>>>> On 3/13/26 07:14, JP Kobryn (Meta) wrote:
>>>>> On 3/12/26 10:07 PM, Huang, Ying wrote:
>>>>>> "JP Kobryn (Meta)" <jp.kobryn@xxxxxxxxx> writes:
>>>>>>
>>>>>>> On 3/12/26 6:40 AM, Vlastimil Babka (SUSE) wrote:
>>>>>>>
>>>>>>> How about I change from per-policy hit/miss/foreign triplets to a single
>>>>>>> aggregated policy triplet (i.e. just 3 new counters which account for
>>>>>>> all policies)? They would follow the same hit/miss/foreign semantics
>>>>>>> already proposed (visible in quoted text above). This would still
>>>>>>> provide the otherwise missing signal of whether policy-driven
>>>>>>> allocations to a node are intentional or fallback.
>>>>>>>
>>>>>>> Note that I am also planning on moving the stats off of the memcg so the
>>>>>>> 3 new counters will be global per-node in response to similar feedback.
>>>>>>
>>>>>> Emm, what's the difference between these newly added counters and the
>>>>>> existing numa_hit/miss/foreign counters?
>>>>>
>>>>> The existing counters don't account for node masks in the policies that
>>>>> make use of them. An allocation can land on a node in the mask and still
>>>>> be considered a miss because it wasn't the preferred node.
>>>> That sounds like we could just add a new counter, e.g. numa_hit_preferred,
>>>> and adjust definitions accordingly? Or some other variant that fills the gap?
>>>
>>> It's an interesting thought. Looking into these existing counters more,
>>> the in-kernel direct node allocations, which don't fall under any
>>> mempolicy, are also included in these stats. One good example might be
>>> include/linux/skbuff.h, where __dev_alloc_pages() calls
>>> alloc_pages_node_noprof(NUMA_NO_NODE, ...) which eventually reaches
>>> zone_statistics() and increments the stats.
>> IIUC, the default memory policy is used here, that is, MPOL_LOCAL.
>
> I'm not seeing that. zone_statistics() is eventually reached.
> alloc_pages_mpol() is not.
Yes. The page isn't allocated through alloc_pages_mpol(), for example when
we allocate pages for the kernel rather than for user space applications.
However, IMHO, the equivalent memory policy is MPOL_LOCAL, that is,
allocate from the local node first, then fall back to other nodes. I
don't think alloc_pages_mpol() is so special.
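
To make the point concrete, here is a minimal user-space sketch (not kernel
code) of the "local first, then fall back" behavior and of how the existing
hit/miss/foreign accounting reacts to it. All names (MAX_NODES,
model_zone_statistics(), model_alloc_local(), node_has_free[]) are
hypothetical and exist only for this illustration. The fallback order is
simplified to round-robin, whereas the real zonelist is ordered by node
distance, and the numa_local/numa_other counters are left out.

```c
#include <stddef.h>

#define MAX_NODES 4 /* hypothetical node count for this model */

/* User-space model of the per-node NUMA counters; not kernel code. */
struct node_stats {
	unsigned long numa_hit;
	unsigned long numa_miss;
	unsigned long numa_foreign;
};

static struct node_stats stats[MAX_NODES];

/*
 * Model of the existing accounting: a hit when the page comes from the
 * preferred node; otherwise a miss on the node that supplied the page
 * and a foreign on the node that was preferred.
 */
static void model_zone_statistics(int preferred_nid, int allocated_nid)
{
	if (allocated_nid == preferred_nid) {
		stats[allocated_nid].numa_hit++;
	} else {
		stats[allocated_nid].numa_miss++;
		stats[preferred_nid].numa_foreign++;
	}
}

/*
 * Model of a "local first, then fall back" allocation. node_has_free[]
 * is a hypothetical stand-in for "this node can satisfy the request".
 * Returns the node that supplied the page, or -1 if none could.
 */
static int model_alloc_local(int local_nid, const int node_has_free[MAX_NODES])
{
	for (int i = 0; i < MAX_NODES; i++) {
		/* Simplified fallback order: round-robin from the local node. */
		int nid = (local_nid + i) % MAX_NODES;

		if (node_has_free[nid]) {
			model_zone_statistics(local_nid, nid);
			return nid;
		}
	}
	return -1;
}
```

In this model, a direct kernel allocation with no explicit node and an
allocation under MPOL_LOCAL are indistinguishable as far as the counters
are concerned: both prefer the local node and are accounted the same way.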
---
Best Regards,
Huang, Ying