Re: [PATCH] kbuild: try readelf first in gen_symversions
From: Wentao Guan
Date: Fri Jun 05 2026 - 07:11:42 EST
Hello,
> This should probably be CONFIG_LTO_CLANG with flipped branches but...
Right!
> '-m1' appears to get us 50% (12s) of the speed up of 'readelf' (24s) in
> your environment while sticking with 'nm'. I would be more inclined to
> take that change since it is small and correct, rather than switching on
> NM or READELF, as I don't think it is worth the additional complexity.
> FWIW, on one of my test machines with 8 cores and 16 threads, the
> difference is much less noticeable. I think that is going to be in line
> with most developer and build farm hardware, rather than a 2C/4T machine
> like you mention in the initial commit message.
Sorry, it seems my cloud service provider made my results fluctuate :(, and
the first compile time may also be unstable, so I re-tested on a bare-metal
machine with 20 cores/28 threads. Here are the results:
Intel(R) Core(TM) i7-14700HX + 32GB + NVMe SSD
gcc version 12.3.0 binutils 2.46
clang version 18.1.7
source kernel tag v7.0
summary:
1. Switching from nm to readelf still helps on 20 cores/28 threads.
(I think nm has more overhead in libbfd, which shows up as a large drop in
sys time with readelf. I guess it causes a memory-access bottleneck that
affects the overall compile process.)
But there seems to be no such difference when changing llvm-18-nm to
llvm-18-readelf.
2. -m1 does not seem to have the expected effect...
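One possible reason for point 2, as far as I understand grep: -q already makes
grep exit with status 0 at the first match, so adding -m1 may not change how
much of the input is read. A small sketch to check this (the helper name
first_match_exit and the symbol __export_symbol_foo are made up for
illustration):

```shell
# `yes` prints the same line forever, so this pipeline can only finish
# if grep stops reading after the first matching line.
# __export_symbol_foo is a made-up symbol name, not taken from a real object.
first_match_exit() {
    yes '0000000000000000 r __export_symbol_foo' | grep "$@" ' __export_symbol_'
    echo "$?"
}

first_match_exit -q        # plain -q
first_match_exit -m1 -q    # -q plus -m1
```

If both calls return right away with the same status, -m1 adds nothing on top
of -q for this check.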
test scripts:
https://gist.github.com/opsiff/832baa9a6986343dddbe530fbee57f52
Makefile.build-nm-m1 : 'grep -q' -> 'grep -m1 -q'
Makefile.build-orig : orig Makefile.build
Makefile.build-readelf : 'NM' -> 'READELF -sW'
Makefile.build-readelf-m1: 'NM' -> 'READELF -sW' , 'grep -q' -> 'grep -m1 -q'
full result:
1. run x86_64_defconfig + modversions x3(base)
if $(NM) $@ 2>/dev/null | grep -q ' __export_symbol_'; then \
real 2m2.876s real 2m2.578s real 2m2.262s
user 42m15.871s user 42m35.250s user 42m33.679s
sys 5m52.904s sys 5m52.478s sys 5m49.009s
2. if $(READELF) -sW $@ 2>/dev/null | grep -q __export_symbol_; then
real 1m54.931s real 1m55.192s real 1m55.207s
user 41m4.162s user 41m7.754s user 41m5.791s
sys 4m8.422s sys 4m8.431s sys 4m9.219s
3. if $(NM) $@ 2>/dev/null | grep -m1 -q __export_symbol_; then \
real 2m1.865s real 2m1.866s real 2m2.108s
user 42m32.891s user 42m35.047s user 42m33.834s
sys 5m48.045s sys 5m47.700s sys 5m48.200s
4. if $(READELF) -sW $@ 2>/dev/null | grep -m1 -q ' __export_symbol_'; then \
real 1m55.386s real 1m56.528s real 1m55.489s
user 41m6.156s user 41m12.321s user 41m10.545s
sys 4m10.093s sys 4m9.838s sys 4m9.367s
5. LLVM run x86_64_defconfig + modversions x3(base)
if $(NM) $@ 2>/dev/null | grep -q ' __export_symbol_'; then \
real 2m35.758s real 2m32.696s real 2m32.127s
user 58m2.416s user 57m55.030s user 57m54.806s
sys 4m20.735s sys 4m18.473s sys 4m18.090s
6. LLVM if $(READELF) -sW $@ 2>/dev/null | grep -q ' __export_symbol_'; then \
real 2m32.448s real 2m32.419s real 2m32.509s
user 57m57.262s user 57m53.001s user 57m48.842s
sys 4m20.508s sys 4m20.693s sys 4m20.490s
7. LLVM if $(NM) $@ 2>/dev/null | grep -m1 -q ' __export_symbol_'; then \
real 2m32.003s real 2m31.900s real 2m32.276s
user 57m45.786s user 57m46.982s user 57m49.907s
sys 4m18.184s sys 4m17.923s sys 4m18.354s
8. LLVM if $(READELF) -sW $@ 2>/dev/null | grep -m1 -q ' __export_symbol_'; then \
real 2m33.365s real 2m32.186s real 2m32.114s
user 57m49.533s user 57m47.865s user 57m46.591s
sys 4m19.809s sys 4m20.652s sys 4m19.954s
9. LLVM LTO_THIN run x86_64_defconfig + modversions x3(base)
if $(NM) $@ 2>/dev/null | grep -q ' __export_symbol_'; then \
real 3m59.411s real 3m55.945s real 3m56.557s
user 59m38.877s user 59m20.007s user 59m19.009s
sys 4m21.582s sys 4m22.313s sys 4m23.793s
10. LLVM LTO_THIN if $(NM) $@ 2>/dev/null | grep -m1 -q ' __export_symbol_'; then \
real 3m55.722s real 3m56.641s real 3m57.979s
user 59m21.865s user 59m25.634s user 59m20.872s
sys 4m21.303s sys 4m24.174s sys 4m22.695s
Full log:
https://gist.github.com/opsiff/1cd7e0a0553c8416dd13a7e92590a440
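To make the runs easier to compare, the three 'real' times of each case can be
averaged. A minimal sketch, assuming the XmY.YYYs format printed by the shell's
time as shown above (the helper names to_seconds and mean_seconds are made up
for illustration):

```shell
# Convert a `time` value such as "2m2.876s" to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the mean of several `time` values, in seconds.
mean_seconds() {
    for t in "$@"; do to_seconds "$t"; done |
        awk '{ sum += $1 } END { printf "%.3f\n", sum / NR }'
}

mean_seconds 2m2.876s 2m2.578s 2m2.262s     # case 1: nm, GCC
mean_seconds 1m54.931s 1m55.192s 1m55.207s  # case 2: readelf -sW, GCC
```

For cases 1 and 2 this gives about 122.6s for nm and 115.1s for readelf, so
roughly 7.5s (about 6%) less wall time.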
If you have any other ideas, I will happily test them.
I will also try llvm-nm instead of nm.
BRs
Wentao Guan