Re: [PATCH] mm/alloc_tag: add the ARCH_NEEDS_WEAK_PER_CPU macro when statically defining the percpu variable alloc_tag_counters.

From: Hao Ge
Date: Wed Jun 11 2025 - 21:39:43 EST



On 2025/6/11 23:24, Suren Baghdasaryan wrote:
On Tue, Jun 10, 2025 at 10:27 PM Hao Ge <hao.ge@xxxxxxxxx> wrote:

On 2025/6/10 00:39, Suren Baghdasaryan wrote:
On Sun, Jun 8, 2025 at 11:08 PM Hao Ge <hao.ge@xxxxxxxxx> wrote:
On 2025/5/29 15:35, Hao Ge wrote:
From: Hao Ge <gehao@xxxxxxxxxx>

Recently discovered this entry while checking kallsyms on ARM64:
ffff800083e509c0 D _shared_alloc_tag

If ARCH_NEEDS_WEAK_PER_CPU is not defined, there's no need to statically
define the percpu variable alloc_tag_counters.

Therefore, add the relevant macro guards at the appropriate location.

Fixes: 22d407b164ff ("lib: add allocation tagging support for memory allocation profiling")
Signed-off-by: Hao Ge <gehao@xxxxxxxxxx>
---
lib/alloc_tag.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index c7f602fa7b23..d1dab80b70ad 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -24,8 +24,10 @@ static bool mem_profiling_support;

static struct codetag_type *alloc_tag_cttype;

+#ifdef ARCH_NEEDS_WEAK_PER_CPU
DEFINE_PER_CPU(struct alloc_tag_counters, _shared_alloc_tag);
EXPORT_SYMBOL(_shared_alloc_tag);
+#endif /* ARCH_NEEDS_WEAK_PER_CPU */

DEFINE_STATIC_KEY_MAYBE(CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT,
mem_alloc_profiling_key);
Hi Suren


I'm sorry to bother you. As mentioned in my commit message, the
_shared_alloc_tag percpu variable is not actually needed on the ARM64
architecture.

As I understand it, a copy is created for each CPU. Each struct
alloc_tag_counters occupies 16 bytes, so the memory wasted in this
segment grows with the number of CPUs.

I realized that this modification was a mistake. It resulted in a build
error, and the link is as follows:

https://lore.kernel.org/all/202506080448.KWN8arrX-lkp@xxxxxxxxx/

After I studied the comments of DECLARE_PER_CPU_SECTION, I roughly
understood why this is the case.

But so far, I haven't come up with a good way to solve this problem. Do
you have any suggestions?
Hi Hao,
The problem here is that ARCH_NEEDS_WEAK_PER_CPU is not a Kconfig
option, it gets defined only on 2 architectures and only when building
modules here https://elixir.bootlin.com/linux/v6.15.1/source/arch/alpha/include/asm/percpu.h#L14
and here https://elixir.bootlin.com/linux/v6.15.1/source/arch/s390/include/asm/percpu.h#L21.
A nicer way to deal with that is to make it a Kconfig option which is
enabled only for alpha and s390 and then do something like this:

#if defined(MODULE) && defined(ARCH_NEEDS_WEAK_PER_CPU)
#define MODULE_NEEDS_WEAK_PER_CPU
#endif

and replace all the usages of ARCH_NEEDS_WEAK_PER_CPU with
MODULE_NEEDS_WEAK_PER_CPU.
Did I explain the idea clearly?
Thanks,
Suren.

Hi Suren
Hi Hao,

Thanks for your guidance.
I understand this train of thought.

I've been thinking about one thing: I only added the
ARCH_NEEDS_WEAK_PER_CPU macro to guard the definition of
_shared_alloc_tag.

Since s390 defines this macro, why did this build error occur?
Hi Suren
The problem is that ARCH_NEEDS_WEAK_PER_CPU is not a Kconfig option,
it's just a definition, for s390 it's here:
https://elixir.bootlin.com/linux/v6.15.1/source/arch/s390/include/asm/percpu.h#L21
So, even for s390 if you are building core kernel code (not a module),
ARCH_NEEDS_WEAK_PER_CPU will be undefined, however if you are building
a module on s390 then it is defined. So, your change effectively
results in _shared_alloc_tag being compiled out in the core kernel
while it's used when you build a module. Therefore during linking
modules can't link to that symbol in the core kernel. Hope this
explains the issue.
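
The mismatch described above can be modelled with a few boolean helpers.
This is only an illustrative sketch, not kernel code: the struct and all
function names are hypothetical, and the preprocessor conditions are
replaced by runtime booleans so that the core-kernel build and the module
build can be compared side by side.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one compilation unit's build context. */
struct build_ctx {
	bool arch_is_s390_or_alpha;
	bool building_module;	/* stands in for defined(MODULE) */
};

/*
 * Models the v6.15 headers as described in the thread: the macro is
 * defined only on s390/alpha, and only when building a module.
 */
static bool arch_needs_weak_per_cpu(struct build_ctx c)
{
	return c.arch_is_s390_or_alpha && c.building_module;
}

/* With the patch, core code defines _shared_alloc_tag only under the guard. */
static bool core_defines_shared_tag(bool arch_is_s390_or_alpha)
{
	struct build_ctx core = { arch_is_s390_or_alpha, false };

	return arch_needs_weak_per_cpu(core);
}

/* A module refers to _shared_alloc_tag when its own build sees the macro. */
static bool module_uses_shared_tag(bool arch_is_s390_or_alpha)
{
	struct build_ctx mod = { arch_is_s390_or_alpha, true };

	return arch_needs_weak_per_cpu(mod);
}

/* Linking works if nothing refers to the symbol or the core provides it. */
static bool link_succeeds(bool arch_is_s390_or_alpha)
{
	return !module_uses_shared_tag(arch_is_s390_or_alpha) ||
	       core_defines_shared_tag(arch_is_s390_or_alpha);
}

/* Expected outcomes under this model. */
static void check_current_scheme(void)
{
	assert(!core_defines_shared_tag(true));	/* compiled out of the core */
	assert(module_uses_shared_tag(true));	/* still used by modules */
	assert(!link_succeeds(true));		/* s390/alpha modules fail */
	assert(link_succeeds(false));		/* e.g. arm64 is unaffected */
}
```

In this model the guard is never true for core code, so on s390 or alpha
the symbol disappears from the core kernel while modules still refer to it.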


Thank you so, so, so much! I understand now, and thank you for such a detailed explanation.


The way I would fix this is by making ARCH_NEEDS_WEAK_PER_CPU a
Kconfig option and enabling it for s390 and alpha. I would then replace
the old definitions at
https://elixir.bootlin.com/linux/v6.15.1/source/arch/s390/include/asm/percpu.h#L21
and https://elixir.bootlin.com/linux/v6.15.1/source/arch/alpha/include/asm/percpu.h#L14
with:

#if defined(MODULE) && defined(ARCH_NEEDS_WEAK_PER_CPU)
#define MODULE_NEEDS_WEAK_PER_CPU
#endif

Then use MODULE_NEEDS_WEAK_PER_CPU instead of ARCH_NEEDS_WEAK_PER_CPU
in all the current places in the kernel code. Lastly, to compile out
_shared_alloc_tag your current patch should work fine because on s390
and alpha ARCH_NEEDS_WEAK_PER_CPU will be defined after all these
changes.
Does that make sense?
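
The proposed arrangement can be sketched the same way. Again this is a
hypothetical model rather than kernel code: it assumes the arch-level
symbol becomes visible to both core and module builds on s390/alpha, and
that module-side users switch to MODULE_NEEDS_WEAK_PER_CPU. All function
names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed: the arch-level symbol is set for the whole build on s390/alpha. */
static bool arch_needs_weak_per_cpu(bool arch_is_s390_or_alpha)
{
	return arch_is_s390_or_alpha;
}

/* Models: defined(MODULE) && defined(ARCH_NEEDS_WEAK_PER_CPU). */
static bool module_needs_weak_per_cpu(bool arch_is_s390_or_alpha,
				      bool building_module)
{
	return building_module &&
	       arch_needs_weak_per_cpu(arch_is_s390_or_alpha);
}

/* Core code keeps the arch-level guard around _shared_alloc_tag. */
static bool core_defines_shared_tag(bool arch_is_s390_or_alpha)
{
	return arch_needs_weak_per_cpu(arch_is_s390_or_alpha);
}

/* Modules use the shared tag when MODULE_NEEDS_WEAK_PER_CPU is set. */
static bool module_uses_shared_tag(bool arch_is_s390_or_alpha)
{
	return module_needs_weak_per_cpu(arch_is_s390_or_alpha, true);
}

/* Linking works if nothing refers to the symbol or the core provides it. */
static bool link_succeeds(bool arch_is_s390_or_alpha)
{
	return !module_uses_shared_tag(arch_is_s390_or_alpha) ||
	       core_defines_shared_tag(arch_is_s390_or_alpha);
}

/* Expected outcomes under this model. */
static void check_proposed_scheme(void)
{
	assert(core_defines_shared_tag(true));	/* s390/alpha keep the symbol */
	assert(!core_defines_shared_tag(false));	/* e.g. arm64 drops it */
	assert(link_succeeds(true));
	assert(link_succeeds(false));
}
```

Under these assumptions the definition and its users agree on every
architecture, and architectures other than s390 and alpha no longer carry
the unused percpu copy.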


I quite agree with this approach.

Thanks
Best Regards
Hao

Could you please help explain it again?

Thanks
Best Regards
Hao