Re: [PATCH v3] perf: Add is_mapping_symbol() helper for kernel mapping symbol filtering
From: Rui Qi
Date: Fri May 22 2026 - 03:40:10 EST
On 5/7/26 11:23 PM, Ian Rogers wrote:
> On Thu, May 7, 2026 at 12:11 AM Rui Qi <qirui.001@xxxxxxxxxxxxx> wrote:
>>
>> The perf tool currently has ad-hoc logic to filter out ELF mapping
>> symbols scattered across multiple files. ARM, AArch64 and RISC-V each
>> have their own inline checks in dso__load_sym_internal(), and kallsym
>> processing has yet another check for ARM module symbols.
>>
>> This is fragile: adding support for a new architecture or adjusting
>> which prefixes are considered mapping symbols requires touching
>> multiple places, and it is easy for the checks to diverge. It also
>> does not match the kernel's own is_mapping_symbol() logic, which
>> additionally covers x86 local symbols (".L*" and "L0*").
>>
>> Introduce a single is_mapping_symbol() inline helper in symbol.h and
>> convert all kernel symbol handling to use it. The helper covers the
>> existing "$" prefix used by ARM, AArch64 and RISC-V, and also adds
>> the x86 local symbol prefixes so that perf stays consistent with
>> the kernel.
>>
>> Signed-off-by: Rui Qi <qirui.001@xxxxxxxxxxxxx>
>> ---
>> Changes in v3:
>> - Add is_mapping_symbol() check for kernel modules in dso__load_sym_internal()
>> - Add is_mapping_symbol() check in machine__process_ksymbol_unregister()
>>
>> Link (v2): https://lore.kernel.org/all/20260506073820.2419087-1-qirui.001@xxxxxxxxxxxxx/
>>
>> Changes in v2:
>> - Only apply is_mapping_symbol() filtering to kernel symbols (kallsyms
>> and ksymbol events), not to user-space symbols from ELF files,
>> BFD libraries, or perf map files. This avoids incorrectly
>> discarding valid user-space function names that start with '$',
>> which is a legal character in identifiers for many languages
>> (e.g., Java, Scala) and compilers (GCC).
>> - Move the mapping symbol check in machine__process_ksymbol_register()
>> to the beginning of the function, before any map/dso allocation
>> or insertion, to avoid leaving empty maps in the kernel map tree.
>>
>> Link (v1): https://lore.kernel.org/all/20260504090609.1801880-1-qirui.001@xxxxxxxxxxxxx/
>> ---
>> tools/perf/util/machine.c | 12 +++++++++++-
>> tools/perf/util/symbol-elf.c | 8 ++++++++
>> tools/perf/util/symbol.c | 4 ++--
>> tools/perf/util/symbol.h | 15 +++++++++++++++
>> 4 files changed, 36 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
>> index e76f8c86e62a..4e33ba06111d 100644
>> --- a/tools/perf/util/machine.c
>> +++ b/tools/perf/util/machine.c
>> @@ -729,9 +729,15 @@ static int machine__process_ksymbol_register(struct machine *machine,
>> {
>> struct symbol *sym;
>> struct dso *dso = NULL;
>> - struct map *map = maps__find(machine__kernel_maps(machine), event->ksymbol.addr);
>> + struct map *map;
>> int err = 0;
>>
>> + /* Ignore mapping symbols in ksymbol events - check early before any state mutation */
>> + if (is_mapping_symbol(event->ksymbol.name))
>> + return 0;
>> +
>> + map = maps__find(machine__kernel_maps(machine), event->ksymbol.addr);
>> +
>> if (!map) {
>> dso = dso__new(event->ksymbol.name);
>>
>> @@ -790,6 +796,10 @@ static int machine__process_ksymbol_unregister(struct machine *machine,
>> struct symbol *sym;
>> struct map *map;
>>
>> + /* Ignore mapping symbols in ksymbol events */
>> + if (is_mapping_symbol(event->ksymbol.name))
>> + return 0;
>> +
>> map = maps__find(machine__kernel_maps(machine), event->ksymbol.addr);
>> if (!map)
>> return 0;
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index 7afa8a117139..6b12508ea58d 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -1607,6 +1607,14 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
>> continue;
>> }
>>
>> + /*
>> + * For kernel modules, also reject x86 local symbols (.L* and L0*)
>> + * to match the kernel's is_mapping_symbol() logic and kallsyms
>> + * parsing behavior.
>> + */
>> + if (kmodule && is_mapping_symbol(elf_name))
>> + continue;
>> +
>> if (runtime_ss->opdsec && sym.st_shndx == runtime_ss->opdidx) {
>> u32 offset = sym.st_value - syms_ss->opdshdr.sh_addr;
>> u64 *opd = opddata->d_buf + offset;
>> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
>> index fcaeeddbbb6b..af03b16c17c6 100644
>> --- a/tools/perf/util/symbol.c
>> +++ b/tools/perf/util/symbol.c
>> @@ -770,8 +770,8 @@ static int map__process_kallsym_symbol(void *arg, const char *name,
>> if (!symbol_type__filter(type))
>> return 0;
>>
>> - /* Ignore local symbols for ARM modules */
>> - if (name[0] == '$')
>> + /* Ignore mapping symbols in kallsyms */
>> + if (is_mapping_symbol(name))
>> return 0;
>>
>> /*
>> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
>> index bd6eb90c8668..27fa1b43e6f1 100644
>> --- a/tools/perf/util/symbol.h
>> +++ b/tools/perf/util/symbol.h
>> @@ -28,6 +28,21 @@ struct maps;
>> struct option;
>> struct build_id;
>>
>> +/*
>> + * Ignore kernel mapping symbols, matching kernel is_mapping_symbol() logic.
>> + * This checks for '$' prefix (used by ARM, AArch64, RISC-V) and
>> + * x86 local symbol prefixes (.L* and L0*).
>> + * Only use this for kernel symbols (kallsyms, ksymbol events).
>> + */
>> +static inline bool is_mapping_symbol(const char *str)
>
> Is there a good reference for what is meant by "mapping symbol" ?
> Would "local symbol" be more appropriate for x86? On ARM it seems the
> term is well defined:
> https://developer.arm.com/documentation/dui0803/a/Accessing-and-managing-symbols-with-armlink/About-mapping-symbols
Good point. The term "mapping symbol" is well defined for ARM [1] and
RISC-V [2]: both use "$"-prefixed symbols to mark boundaries between
code and data regions, and both explicitly call them "mapping symbols".

However, the ".L*" and "L0*" patterns on x86 are local labels, not
mapping symbols in the same sense. Grouping them under "mapping symbol"
is imprecise.

That said, the naming comes from the existing is_mapping_symbol() in
include/linux/module_symbol.h, which predates this patch. I just
mirrored the kernel-side function into tools/ with an expanded comment,
so the terminology question applies to the kernel-side function as well.

I think the comment in my patch should at least not conflate the two.
How about something like:
/*
 * Ignore symbols that should be skipped in kallsyms output.
 * Matches kernel is_mapping_symbol() logic in module_symbol.h.
 *
 * '$' prefix: ARM/RISC-V mapping symbols ($a, $t, $d, $x, etc.)
 *   See ARM ABI [1] and RISC-V psabi [2].
 * '.L*' and 'L0*': x86 local labels (not mapping symbols per se,
 *   but filtered the same way by the kernel).
 */
A separate cleanup to rename is_mapping_symbol() to something more
accurate (e.g. is_ignorable_symbol()) across the kernel could be done
later, but that's outside the scope of this bugfix.
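
For illustration, here is a minimal standalone sketch of how the helper would read with that comment attached. The function body is the one from the patch; the comment wording is trimmed for the sketch, and mapping_symbol_self_check() and its sample symbol names are hypothetical, made up only to exercise the three prefixes. This is not a proposed change to the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Ignore symbols that should be skipped in kallsyms output.
 *
 * '$' prefix: ARM/RISC-V mapping symbols ($a, $t, $d, $x, etc.)
 * '.L*' and 'L0*': x86 local labels (not mapping symbols per se,
 *   but filtered the same way).
 */
static inline bool is_mapping_symbol(const char *str)
{
	/* str[1] is read only when str[0] is non-NUL, so "" and "L" are safe. */
	if (str[0] == '.' && str[1] == 'L')
		return true;
	if (str[0] == 'L' && str[1] == '0')
		return true;
	return str[0] == '$';
}

/* Hypothetical self-check; the sample names are illustrative only. */
void mapping_symbol_self_check(void)
{
	/* Filtered: '$' prefix, '.L' prefix, 'L0' prefix. */
	assert(is_mapping_symbol("$x"));
	assert(is_mapping_symbol("$d.1"));
	assert(is_mapping_symbol(".L123"));
	assert(is_mapping_symbol("L0"));

	/* Kept: ordinary names and near misses. */
	assert(!is_mapping_symbol("vfs_read"));
	assert(!is_mapping_symbol("Lfoo"));
	assert(!is_mapping_symbol(".text"));
	assert(!is_mapping_symbol("L"));
	assert(!is_mapping_symbol(""));
}
```

The near-miss cases ("Lfoo", ".text", "L") are there to show that only the exact two-character prefixes are filtered.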
[1] https://developer.arm.com/documentation/dui0803/a/Accessing-and-managing-symbols-with-armlink/About-mapping-symbols
[2] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-elf.adoc#mapping-symbol
> I'm wondering if we can make this more intention-revealing. I'm
> wondering also if we should make the check dependent on the e_machine,
> so perhaps:
> ```
> static inline bool is_ignored_kernel_symbol(const char *name, uint16_t e_machine)
> {
>         if (e_machine == EM_386 || e_machine == EM_X86_64) {
>                 /* Local symbols on x86 may start .L or L0. */
>                 return (name[0] == '.' && name[1] == 'L') ||
>                        (name[0] == 'L' && name[1] == '0');
>         }
>         /*
>          * All other machine types. Assume symbols starting $ are mapping
>          * symbols used to denote transitions between different sections of
>          * data and code.
>          */
>         return name[0] == '$';
> }
> ```
> I think you can use something like `dso__e_machine` at all call sites
> to get the ELF machine type for the binary or kernel, but maybe this
> is overkill.
>
> Thanks,
> Ian
>
Yes, it's overkill.
The current unconditional checks don't produce false positives across
architectures:
- Checking .L / L0 on ARM: ARM assemblers never generate these symbols,
  so the check never matches -- it's a no-op.
- Checking $ on x86: x86 ELF symbols essentially never start with $,
  same story.

The function is functionally correct as-is. Checking for a pattern that
doesn't exist in a given architecture is harmless.
The cost of introducing e_machine is real:
- API break: every callsite (kernel/module/kallsyms.c,
  scripts/mod/modpost.h) must now pass in architecture info.
- e_machine retrieval is non-trivial: the kernel side and modpost side
  get ELF headers through different paths, each needing separate
  adaptation.
- Test matrix explosion: one codepath serving all architectures becomes
  per-architecture branches, each needing independent verification.

The one thing worth fixing is the function name. is_mapping_symbol
is genuinely misleading -- .L/L0 are local labels, not mapping
symbols. But fixing the name doesn't require an e_machine parameter.
A simple rename to is_ignored_symbol would do.
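
To make the comparison concrete, here is a rough sketch that puts the unconditional check (under the suggested is_ignored_symbol name) next to an e_machine-gated variant along the lines Ian proposed. The SKETCH_EM_* constants are local stand-ins carrying the standard ELF e_machine values so the sketch does not depend on <elf.h>. variants_agree() and compare_variants_self_check() are hypothetical helpers for illustration only, and the sample names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the ELF e_machine values (assumption: sketch only). */
enum {
	SKETCH_EM_386 = 3,
	SKETCH_EM_X86_64 = 62,
	SKETCH_EM_AARCH64 = 183,
};

/* Unconditional variant: same checks as the patch, under the new name. */
static inline bool is_ignored_symbol(const char *name)
{
	if (name[0] == '.' && name[1] == 'L')
		return true;
	if (name[0] == 'L' && name[1] == '0')
		return true;
	return name[0] == '$';
}

/* Gated variant: x86 checks only on x86, '$' check everywhere else. */
static inline bool is_ignored_kernel_symbol(const char *name, uint16_t e_machine)
{
	if (e_machine == SKETCH_EM_386 || e_machine == SKETCH_EM_X86_64)
		return (name[0] == '.' && name[1] == 'L') ||
		       (name[0] == 'L' && name[1] == '0');
	return name[0] == '$';
}

/* True when both variants give the same answer for this name and machine. */
static inline bool variants_agree(const char *name, uint16_t e_machine)
{
	return is_ignored_symbol(name) ==
	       is_ignored_kernel_symbol(name, e_machine);
}

void compare_variants_self_check(void)
{
	/* Names in the form expected on each architecture: no difference. */
	assert(variants_agree(".Lfoo", SKETCH_EM_X86_64));
	assert(variants_agree("L0", SKETCH_EM_386));
	assert(variants_agree("$x", SKETCH_EM_AARCH64));
	assert(variants_agree("vfs_read", SKETCH_EM_X86_64));

	/* The variants differ only on cross-architecture prefixes. */
	assert(!variants_agree("$x", SKETCH_EM_X86_64));
	assert(!variants_agree(".Lfoo", SKETCH_EM_AARCH64));
}
```

The two variants only disagree on names such as "$x" in an x86 kernel or ".Lfoo" in an AArch64 one, which is exactly the set argued above not to occur in practice.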
>> +{
>> + if (str[0] == '.' && str[1] == 'L')
>> + return true;
>> + if (str[0] == 'L' && str[1] == '0')
>> + return true;
>> + return str[0] == '$';
>> +}
>> +
>> /*
>> * libelf 0.8.x and earlier do not support ELF_C_READ_MMAP;
>> * for newer versions we can use mmap to reduce memory usage:
>> --
>> 2.20.1