[linux-next:master] [randomize_kstack] a96ef5848c: will-it-scale.per_thread_ops 7.7% improvement
From: kernel test robot
Date: Tue Mar 31 2026 - 04:47:54 EST
Hello,
kernel test robot noticed a 7.7% improvement of will-it-scale.per_thread_ops on:
commit: a96ef5848cb096226bf6aff31a90d8b136d99b71 ("randomize_kstack: Unify random source across arches")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:
  nr_task: 100%
  mode: thread
  test: lseek1
  cpufreq_governor: performance
In addition, the commit also has a significant impact on the following test:
+------------------+--------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 4.7% improvement |
| test parameters  | cpufreq_governor=performance                                 |
|                  | mode=thread                                                  |
|                  | nr_task=100%                                                 |
|                  | test=getppid1                                                |
+------------------+--------------------------------------------------------------+
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260331/202603311659.6aa92f2c-lkp@xxxxxxxxx
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-14/performance/x86_64-rhel-9.4/thread/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp2/lseek1/will-it-scale
commit:
37beb42560 ("randomize_kstack: Maintain kstack_offset per task")
a96ef5848c ("randomize_kstack: Unify random source across arches")
37beb42560165869 a96ef5848cb096226bf6aff31a9
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.474e+09 +7.7% 1.588e+09 will-it-scale.192.threads
7675604 +7.7% 8270154 will-it-scale.per_thread_ops
1.474e+09 +7.7% 1.588e+09 will-it-scale.workload
37.77 -16.4% 31.57 vmstat.cpu.us
60.95 +6.2 67.17 mpstat.cpu.all.sys%
38.17 -6.3 31.90 mpstat.cpu.all.usr%
0.96 +17.7% 1.13 turbostat.IPC
442.92 +3.6% 459.07 turbostat.PkgWatt
0.01 ± 22% -28.9% 0.01 ± 20% perf-stat.i.MPKI
1.295e+11 +11.4% 1.442e+11 perf-stat.i.branch-instructions
7450278 ± 77% +144.3% 18200516 ± 35% perf-stat.i.branch-misses
4646936 ± 5% -10.0% 4180940 ± 5% perf-stat.i.cache-references
1.04 -15.2% 0.88 perf-stat.i.cpi
5.856e+11 +18.0% 6.907e+11 perf-stat.i.instructions
0.96 +18.0% 1.13 perf-stat.i.ipc
0.00 ± 3% -15.2% 0.00 ± 2% perf-stat.overall.MPKI
0.01 ± 77% +0.0 0.01 ± 35% perf-stat.overall.branch-miss-rate%
10.98 ± 5% +1.3 12.24 ± 5% perf-stat.overall.cache-miss-rate%
1.04 -15.3% 0.88 perf-stat.overall.cpi
0.96 +18.0% 1.13 perf-stat.overall.ipc
119899 +9.5% 131347 perf-stat.overall.path-length
1.29e+11 +11.4% 1.437e+11 perf-stat.ps.branch-instructions
7425104 ± 77% +144.1% 18126281 ± 35% perf-stat.ps.branch-misses
4734855 ± 5% -10.3% 4248990 ± 5% perf-stat.ps.cache-references
5.837e+11 +18.0% 6.885e+11 perf-stat.ps.instructions
1.767e+14 +18.0% 2.086e+14 perf-stat.total.instructions
8.77 ± 2% -8.8 0.00 perf-profile.calltrace.cycles-pp.arch_exit_to_user_mode_prepare.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
39.72 -8.2 31.51 ± 5% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
42.72 -8.0 34.73 ± 5% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.llseek
26.12 -2.5 23.64 ± 4% perf-profile.calltrace.cycles-pp.__x64_sys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
13.12 ± 3% -1.7 11.38 ± 6% perf-profile.calltrace.cycles-pp.fdget_pos.__x64_sys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
7.17 ± 3% -1.4 5.74 ± 7% perf-profile.calltrace.cycles-pp.__fget_files.fdget_pos.__x64_sys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.86 -0.3 4.53 ± 4% perf-profile.calltrace.cycles-pp.mutex_unlock.__x64_sys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
1.39 +0.3 1.68 ± 7% perf-profile.calltrace.cycles-pp.lseek@plt
2.48 ± 2% +0.5 3.02 ± 8% perf-profile.calltrace.cycles-pp.testcase
1.20 ± 5% +1.2 2.39 ± 8% perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
0.00 +1.4 1.40 ± 8% perf-profile.calltrace.cycles-pp.prandom_u32_state.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
39.88 +7.5 47.39 ± 4% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.llseek
8.77 ± 2% -8.5 0.29 ± 5% perf-profile.children.cycles-pp.arch_exit_to_user_mode_prepare
39.86 -8.2 31.69 ± 5% perf-profile.children.cycles-pp.do_syscall_64
42.77 -8.0 34.78 ± 5% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
26.26 -2.4 23.83 ± 4% perf-profile.children.cycles-pp.__x64_sys_lseek
13.22 ± 3% -1.8 11.44 ± 6% perf-profile.children.cycles-pp.fdget_pos
7.24 ± 3% -1.4 5.79 ± 7% perf-profile.children.cycles-pp.__fget_files
98.29 -0.4 97.92 perf-profile.children.cycles-pp.llseek
4.92 -0.3 4.59 ± 4% perf-profile.children.cycles-pp.mutex_unlock
0.20 ± 2% -0.0 0.18 ± 2% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.19 -0.0 0.16 ± 2% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.18 -0.0 0.15 ± 3% perf-profile.children.cycles-pp.hrtimer_interrupt
0.18 ± 2% -0.0 0.16 ± 2% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.10 ± 3% -0.0 0.08 ± 4% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.10 -0.0 0.08 ± 4% perf-profile.children.cycles-pp.tick_nohz_handler
0.08 -0.0 0.06 ± 7% perf-profile.children.cycles-pp.update_process_times
0.76 ± 2% +0.2 0.92 ± 7% perf-profile.children.cycles-pp.lseek@plt
2.43 ± 2% +0.5 2.95 ± 8% perf-profile.children.cycles-pp.testcase
1.22 ± 4% +1.2 2.45 ± 8% perf-profile.children.cycles-pp.x64_sys_call
0.00 +1.4 1.40 ± 8% perf-profile.children.cycles-pp.prandom_u32_state
26.94 +4.7 31.60 ± 4% perf-profile.children.cycles-pp.entry_SYSCALL_64
8.72 ± 2% -8.5 0.23 ± 5% perf-profile.self.cycles-pp.arch_exit_to_user_mode_prepare
7.18 ± 3% -1.4 5.78 ± 7% perf-profile.self.cycles-pp.__fget_files
4.84 -0.3 4.52 ± 4% perf-profile.self.cycles-pp.mutex_unlock
1.07 ± 3% -0.1 0.94 ± 7% perf-profile.self.cycles-pp.fdget_pos
0.06 ± 7% +0.0 0.08 ± 10% perf-profile.self.cycles-pp.lseek@plt
3.62 +0.2 3.83 ± 4% perf-profile.self.cycles-pp.do_syscall_64
1.68 ± 2% +0.4 2.05 ± 8% perf-profile.self.cycles-pp.testcase
1.16 ± 5% +1.3 2.41 ± 8% perf-profile.self.cycles-pp.x64_sys_call
0.00 +1.3 1.33 ± 8% perf-profile.self.cycles-pp.prandom_u32_state
13.18 ± 2% +1.7 14.89 ± 6% perf-profile.self.cycles-pp.entry_SYSCALL_64
18.28 +3.9 22.16 ± 5% perf-profile.self.cycles-pp.llseek
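A note on reading the %change column above: judging from the numbers themselves, entries that carry a % sign look like relative changes, while entries without one (the mpstat and perf-profile rows) look like absolute differences in percentage points. That reading is an inference from the data, not something the report states. A minimal Python sketch, with hypothetical helper names, that recomputes three rows of the lseek1 table under that assumption:

```python
# Assumption: "%change" with a % sign is relative; without one it is an
# absolute difference between the two columns. Helper names are illustrative.

def relative_change_pct(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0

def absolute_delta(base, new):
    """Plain difference between two values already expressed in percent."""
    return new - base

# Values copied from the lseek1 table.
ops_change = relative_change_pct(7675604, 8270154)   # will-it-scale.per_thread_ops
us_change = relative_change_pct(37.77, 31.57)        # vmstat.cpu.us
sys_delta = absolute_delta(60.95, 67.17)             # mpstat.cpu.all.sys%

print(f"per_thread_ops: {ops_change:+.1f}%")
print(f"vmstat.cpu.us:  {us_change:+.1f}%")
print(f"mpstat sys%:    {sys_delta:+.1f}")
```

Rows whose displayed values are heavily rounded (for example perf-stat.i.cpi) may not reproduce exactly this way, since the report presumably computes the change from unrounded data.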
***************************************************************************************************
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-14/performance/x86_64-rhel-9.4/thread/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp2/getppid1/will-it-scale
commit:
37beb42560 ("randomize_kstack: Maintain kstack_offset per task")
a96ef5848c ("randomize_kstack: Unify random source across arches")
37beb42560165869 a96ef5848cb096226bf6aff31a9
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.987e+09 +4.7% 2.079e+09 will-it-scale.192.threads
10346487 +4.7% 10828131 will-it-scale.per_thread_ops
1.987e+09 +4.7% 2.079e+09 will-it-scale.workload
0.79 +20.3% 0.95 turbostat.IPC
53.28 +3.8 57.11 mpstat.cpu.all.sys%
45.85 -3.8 42.01 mpstat.cpu.all.usr%
1.111e+11 +10.2% 1.225e+11 perf-stat.i.branch-instructions
4803948 ± 2% +9.6% 5267473 ± 5% perf-stat.i.cache-references
1.27 -17.3% 1.05 perf-stat.i.cpi
4.821e+11 +21.0% 5.833e+11 perf-stat.i.instructions
0.79 +20.9% 0.95 perf-stat.i.ipc
0.00 ± 4% -18.4% 0.00 ± 3% perf-stat.overall.MPKI
0.01 ± 60% -0.0 0.00 perf-stat.overall.branch-miss-rate%
1.27 -17.3% 1.05 perf-stat.overall.cpi
0.79 +21.0% 0.95 perf-stat.overall.ipc
73248 +15.6% 84666 perf-stat.overall.path-length
1.107e+11 +10.2% 1.221e+11 perf-stat.ps.branch-instructions
4903095 ± 2% +9.2% 5356604 ± 4% perf-stat.ps.cache-references
4.806e+11 +21.0% 5.814e+11 perf-stat.ps.instructions
1.455e+14 +21.0% 1.76e+14 perf-stat.total.instructions
4.78 ± 16% -2.7 2.06 ± 5% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.getppid
5.09 ± 9% -2.5 2.58 ± 10% perf-profile.calltrace.cycles-pp.__task_pid_nr_ns.__x64_sys_getppid.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
5.69 ± 8% -2.5 3.18 ± 10% perf-profile.calltrace.cycles-pp.__x64_sys_getppid.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
6.88 ± 3% -2.2 4.72 ± 16% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.getppid
1.36 ± 2% +1.0 2.38 ± 13% perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
0.00 +1.5 1.50 ± 15% perf-profile.calltrace.cycles-pp.getppid@plt
0.00 +1.9 1.88 ± 14% perf-profile.calltrace.cycles-pp.prandom_u32_state.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
5.46 ± 17% +2.1 7.56 ± 5% perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.getppid
10.78 ± 5% +3.8 14.62 ± 10% perf-profile.calltrace.cycles-pp.testcase
44.13 ± 7% -5.9 38.20 ± 5% perf-profile.children.cycles-pp.entry_SYSCALL_64
4.84 ± 16% -2.7 2.16 ± 6% perf-profile.children.cycles-pp.syscall_return_via_sysret
5.85 ± 9% -2.6 3.28 ± 11% perf-profile.children.cycles-pp.__x64_sys_getppid
5.17 ± 9% -2.5 2.64 ± 11% perf-profile.children.cycles-pp.__task_pid_nr_ns
6.04 ± 3% -1.8 4.22 ± 15% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
2.12 ± 10% -1.7 0.38 ± 12% perf-profile.children.cycles-pp.arch_exit_to_user_mode_prepare
98.57 -0.6 97.99 perf-profile.children.cycles-pp.getppid
0.51 ± 7% +0.3 0.83 ± 14% perf-profile.children.cycles-pp.getppid@plt
1.44 ± 2% +1.0 2.42 ± 13% perf-profile.children.cycles-pp.x64_sys_call
0.00 +1.9 1.88 ± 14% perf-profile.children.cycles-pp.prandom_u32_state
6.32 ± 5% +2.4 8.68 ± 11% perf-profile.children.cycles-pp.testcase
20.95 ± 9% +10.1 31.04 ± 9% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
23.16 ± 6% -7.0 16.11 ± 16% perf-profile.self.cycles-pp.entry_SYSCALL_64
4.79 ± 16% -2.6 2.16 ± 6% perf-profile.self.cycles-pp.syscall_return_via_sysret
5.09 ± 9% -2.5 2.60 ± 11% perf-profile.self.cycles-pp.__task_pid_nr_ns
2.04 ± 10% -1.7 0.33 ± 7% perf-profile.self.cycles-pp.arch_exit_to_user_mode_prepare
4.56 ± 3% -1.3 3.25 ± 15% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
4.30 ± 2% -1.1 3.19 ± 15% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
1.40 ± 7% +0.6 1.98 ± 12% perf-profile.self.cycles-pp.testcase
1.35 ± 2% +1.0 2.35 ± 13% perf-profile.self.cycles-pp.x64_sys_call
0.00 +1.7 1.68 ± 15% perf-profile.self.cycles-pp.prandom_u32_state
20.89 ± 9% +10.1 30.94 ± 10% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
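Several of the perf-stat rows in both tables can be cross-checked against each other. The sketch below assumes, based only on how the numbers line up, that workload is per_thread_ops times the 192 threads, that perf-stat.overall.path-length is total instructions divided by workload, and that IPC is the reciprocal of CPI. The dictionary layout and function names are illustrative, not part of the lkp-tests tooling.

```python
THREADS = 192  # from the "192 threads" test machine description

# (per_thread_ops, workload, total_instructions, path_length, cpi, ipc),
# copied from the two tables in this report.
RUNS = {
    "lseek1 37beb42560": (7675604, 1.474e9, 1.767e14, 119899, 1.04, 0.96),
    "lseek1 a96ef5848c": (8270154, 1.588e9, 2.086e14, 131347, 0.88, 1.13),
    "getppid1 37beb42560": (10346487, 1.987e9, 1.455e14, 73248, 1.27, 0.79),
    "getppid1 a96ef5848c": (10828131, 2.079e9, 1.76e14, 84666, 1.05, 0.95),
}

def rel_err(estimate, reported):
    """Relative error of a recomputed value against the reported one."""
    return abs(estimate - reported) / reported

def check(run):
    """Recompute the assumed derived metrics and compare with the report."""
    per_thread, workload, instr, path_len, cpi, ipc = run
    return {
        "workload": rel_err(per_thread * THREADS, workload),
        "path_length": rel_err(instr / workload, path_len),
        "ipc": rel_err(1.0 / cpi, ipc),
    }

for name, run in RUNS.items():
    errors = check(run)
    print(name, {key: f"{err:.2%}" for key, err in errors.items()})
```

If these assumed relationships hold, the tables tell a consistent story: the new commit executes more instructions per operation (path-length up 9.5% and 15.6%), but IPC rises by a larger margin (18.0% and 21.0%), which matches the net throughput gains of 7.7% and 4.7%.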
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki