[linus:master] [mm] a2e0c0668a: stress-ng.memthrash.ops_per_sec 32.1% improvement

From: kernel test robot

Date: Fri May 22 2026 - 02:59:23 EST




Hello,

kernel test robot noticed a 32.1% improvement in stress-ng.memthrash.ops_per_sec on:


commit: a2e0c0668a3486f96b86c50e02872c8e94fd4f9c ("mm: migrate: requeue destination folio on deferred split queue")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

nr_threads: 50%
testtime: 60s
test: memthrash
cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260522/202605221417.62e37e78-lkp@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-14/performance/x86_64-rhel-9.4/50%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp2/memthrash/stress-ng/60s

commit:
2d028f3e4b ("selftest: memcg: skip memcg_sock test if address family not supported")
a2e0c0668a ("mm: migrate: requeue destination folio on deferred split queue")

2d028f3e4bbbfd44 a2e0c0668a3486f96b86c50e028
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
28270 ± 2% +34.8% 38095 ± 2% stress-ng.memthrash.ops
439.06 ± 2% +32.1% 579.80 ± 2% stress-ng.memthrash.ops_per_sec
147882 -42.4% 85190 ± 3% stress-ng.time.involuntary_context_switches
267068 +6.4% 284240 stress-ng.time.maximum_resident_set_size
317026 ± 4% +419.6% 1647291 ± 8% stress-ng.time.minor_page_faults
6692 -63.9% 2412 ± 2% stress-ng.time.percent_of_cpu_this_job_got
3079 -88.7% 347.19 stress-ng.time.system_time
6295697 +33.7% 8416356 stress-ng.time.voluntary_context_switches
8.254e+09 +37.3% 1.133e+10 cpuidle..time
22013986 +38.4% 30466146 cpuidle..usage
8738 ± 15% -83.2% 1465 ± 63% perf-c2c.DRAM.local
10218 ± 15% -63.7% 3705 ± 58% perf-c2c.DRAM.remote
410.50 ± 29% +242.3% 1405 ± 57% perf-c2c.HITM.remote
19232592 -9.8% 17340551 meminfo.AnonHugePages
9561429 ± 8% +9.3% 10446848 ± 5% meminfo.DirectMap2M
404415 ± 3% -12.9% 352442 ± 6% meminfo.Mapped
1159036 +21.9% 1413282 ± 2% meminfo.SUnreclaim
1301140 +19.5% 1555377 ± 2% meminfo.Slab
62.40 +23.6 86.02 mpstat.cpu.all.idle%
1.31 ± 3% -1.3 0.02 ± 19% mpstat.cpu.all.iowait%
0.78 -0.2 0.59 mpstat.cpu.all.irq%
0.11 -0.0 0.10 mpstat.cpu.all.soft%
25.06 -21.7 3.37 mpstat.cpu.all.sys%
589495 ± 11% +25.0% 736788 ± 7% numa-meminfo.node0.SUnreclaim
19200 ± 4% -7.7% 17720 ± 6% numa-meminfo.node1.KernelStack
329434 ± 15% -35.2% 213453 ± 37% numa-meminfo.node1.Mapped
17149792 ± 9% -12.7% 14965621 ± 5% numa-meminfo.node1.MemUsed
569830 ± 12% +18.7% 676242 ± 9% numa-meminfo.node1.SUnreclaim
63.56 +36.0% 86.44 vmstat.cpu.id
2.61 ± 9% -99.3% 0.02 ± 55% vmstat.procs.b
67.19 -63.7% 24.37 ± 2% vmstat.procs.r
553107 +35.7% 750510 vmstat.system.cs
1527226 -36.3% 972506 vmstat.system.in
829785 ± 12% +106.7% 1715157 ± 9% numa-numastat.node0.local_node
1442509 ± 4% +95.5% 2819818 ± 4% numa-numastat.node0.numa_hit
612723 ± 8% +80.3% 1104660 ± 7% numa-numastat.node0.other_node
1149611 ± 8% +63.0% 1874175 ± 6% numa-numastat.node1.local_node
1739638 ± 3% +71.8% 2988324 ± 4% numa-numastat.node1.numa_hit
590024 ± 8% +88.8% 1114149 ± 8% numa-numastat.node1.other_node
10365 ± 3% -99.7% 28.54 ± 28% numa-vmstat.node0.nr_isolated_anon
147414 ± 11% +24.9% 184082 ± 7% numa-vmstat.node0.nr_slab_unreclaimable
1442492 ± 4% +95.5% 2820145 ± 4% numa-vmstat.node0.numa_hit
829769 ± 12% +106.7% 1715484 ± 9% numa-vmstat.node0.numa_local
612723 ± 8% +80.3% 1104660 ± 7% numa-vmstat.node0.numa_other
10294 ± 5% -99.6% 46.16 ± 28% numa-vmstat.node1.nr_isolated_anon
19182 ± 4% -7.7% 17707 ± 6% numa-vmstat.node1.nr_kernel_stack
82493 ± 15% -35.3% 53391 ± 37% numa-vmstat.node1.nr_mapped
142496 ± 12% +18.6% 168975 ± 9% numa-vmstat.node1.nr_slab_unreclaimable
1739254 ± 3% +71.9% 2988936 ± 4% numa-vmstat.node1.numa_hit
1149227 ± 8% +63.1% 1874787 ± 6% numa-vmstat.node1.numa_local
590024 ± 8% +88.8% 1114149 ± 8% numa-vmstat.node1.numa_other
1169 -60.6% 460.67 turbostat.Avg_MHz
36.54 -22.2 14.38 turbostat.Busy%
63.94 +22.4 86.29 turbostat.C1%
63.35 +35.0% 85.54 turbostat.CPU%c1
47.50 -7.0% 44.17 turbostat.CoreTmp
0.32 +125.1% 0.72 ± 3% turbostat.IPC
1.114e+08 -38.0% 69064256 ± 2% turbostat.IRQ
3512691 -63.2% 1293273 ± 2% turbostat.NMI
47.67 -5.6% 45.00 turbostat.PkgTmp
339.47 -24.9% 254.84 turbostat.PkgWatt
47.89 -35.1% 31.10 ± 2% turbostat.RAMWatt
0.04 -25.0% 0.03 turbostat.SysWatt
0.01 ± 11% -45.0% 0.01 ± 9% perf-sched.sch_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
586.78 ± 52% -76.6% 137.29 ± 48% perf-sched.sch_delay.max.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
0.01 ± 11% -45.0% 0.01 ± 9% perf-sched.total_sch_delay.average.ms
586.78 ± 52% -76.6% 137.29 ± 48% perf-sched.total_sch_delay.max.ms
2.22 ± 2% -18.5% 1.81 ± 6% perf-sched.total_wait_and_delay.average.ms
1256396 ± 2% +24.5% 1564754 ± 5% perf-sched.total_wait_and_delay.count.ms
2835 ± 22% +36.7% 3876 ± 11% perf-sched.total_wait_and_delay.max.ms
2.21 ± 2% -18.3% 1.80 ± 6% perf-sched.total_wait_time.average.ms
2835 ± 22% +36.7% 3876 ± 11% perf-sched.total_wait_time.max.ms
2.22 ± 2% -18.5% 1.81 ± 6% perf-sched.wait_and_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
1256396 ± 2% +24.5% 1564754 ± 5% perf-sched.wait_and_delay.count.[unknown].[unknown].[unknown].[unknown].[unknown]
2835 ± 22% +36.7% 3876 ± 11% perf-sched.wait_and_delay.max.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
2.21 ± 2% -18.3% 1.80 ± 6% perf-sched.wait_time.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
2835 ± 22% +36.7% 3876 ± 11% perf-sched.wait_time.max.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
6388010 -2.7% 6218187 proc-vmstat.nr_active_anon
6151592 -3.2% 5956303 proc-vmstat.nr_anon_pages
9390 -9.8% 8467 proc-vmstat.nr_anon_transparent_hugepages
19371 ± 2% -99.7% 58.14 ± 16% proc-vmstat.nr_isolated_anon
101105 ± 3% -12.9% 88112 ± 6% proc-vmstat.nr_mapped
289767 +21.9% 353298 ± 2% proc-vmstat.nr_slab_unreclaimable
6388005 -2.7% 6218187 proc-vmstat.nr_zone_active_anon
3187240 +82.5% 5815893 proc-vmstat.numa_hit
31673 ± 6% -42.5% 18217 ± 4% proc-vmstat.numa_huge_pte_updates
1984495 +81.2% 3596704 ± 2% proc-vmstat.numa_local
1202748 +84.5% 2218810 ± 2% proc-vmstat.numa_other
16271854 ± 6% -42.4% 9369899 ± 4% proc-vmstat.numa_pte_updates
1.038e+09 -98.2% 18939715 proc-vmstat.pgalloc_normal
672777 +197.7% 2002601 ± 6% proc-vmstat.pgfault
1.038e+09 -98.2% 18467962 proc-vmstat.pgfree
1.03e+09 -98.9% 11586287 ± 3% proc-vmstat.pgmigrate_success
113875 ± 12% -58.0% 47810 ± 25% proc-vmstat.pgreuse
2010888 -99.3% 13200 ± 4% proc-vmstat.thp_deferred_split_page
2012617 -99.4% 12669 ± 5% proc-vmstat.thp_migration_success
4904 ± 3% +36.5% 6696 ± 4% proc-vmstat.thp_split_pmd
14192 -50.1% 7083 ± 2% sched_debug.cfs_rq:/.avg_vruntime.avg
10454 ± 2% -75.4% 2571 ± 10% sched_debug.cfs_rq:/.avg_vruntime.min
0.19 ± 17% -50.0% 0.09 ± 7% sched_debug.cfs_rq:/.h_nr_queued.avg
0.34 ± 4% -15.2% 0.29 ± 4% sched_debug.cfs_rq:/.h_nr_queued.stddev
0.19 ± 17% -50.0% 0.09 ± 7% sched_debug.cfs_rq:/.h_nr_runnable.avg
0.34 ± 4% -15.2% 0.29 ± 4% sched_debug.cfs_rq:/.h_nr_runnable.stddev
175981 ± 18% -61.1% 68457 ± 9% sched_debug.cfs_rq:/.load.avg
254892 ± 5% -23.5% 194995 ± 9% sched_debug.cfs_rq:/.load.stddev
1357 ± 37% -57.0% 583.98 ± 36% sched_debug.cfs_rq:/.load_avg.avg
0.19 ± 16% -49.1% 0.10 ± 8% sched_debug.cfs_rq:/.nr_queued.avg
0.34 ± 4% -12.8% 0.29 ± 5% sched_debug.cfs_rq:/.nr_queued.stddev
404.25 ± 11% -33.8% 267.42 ± 2% sched_debug.cfs_rq:/.removed.runnable_avg.max
404.25 ± 11% -33.8% 267.42 ± 2% sched_debug.cfs_rq:/.removed.util_avg.max
267.52 ± 8% -47.5% 140.35 ± 4% sched_debug.cfs_rq:/.runnable_avg.avg
211.10 ± 5% +19.4% 252.05 ± 2% sched_debug.cfs_rq:/.runnable_avg.stddev
266.14 ± 8% -47.5% 139.71 ± 4% sched_debug.cfs_rq:/.util_avg.avg
210.96 ± 5% +19.2% 251.46 ± 2% sched_debug.cfs_rq:/.util_avg.stddev
75.17 ± 23% -68.0% 24.09 ± 16% sched_debug.cfs_rq:/.util_est.avg
14192 -50.1% 7083 ± 2% sched_debug.cfs_rq:/.zero_vruntime.avg
10454 ± 2% -75.4% 2571 ± 10% sched_debug.cfs_rq:/.zero_vruntime.min
619.25 +6.3% 658.23 sched_debug.cpu.clock_task.stddev
1594 ± 17% -54.9% 718.80 ± 6% sched_debug.cpu.curr->pid.avg
2781 ± 4% -19.2% 2246 ± 2% sched_debug.cpu.curr->pid.stddev
0.19 ± 15% -52.9% 0.09 ± 7% sched_debug.cpu.nr_running.avg
0.34 ± 3% -17.8% 0.28 ± 4% sched_debug.cpu.nr_running.stddev
90326 +40.3% 126764 sched_debug.cpu.nr_switches.avg
184140 ± 7% +143.7% 448758 ± 5% sched_debug.cpu.nr_switches.max
25673 ± 4% +210.3% 79661 ± 6% sched_debug.cpu.nr_switches.stddev
0.35 ± 9% +29.7% 0.45 sched_debug.cpu.nr_uninterruptible.avg
155.33 ± 13% +61.8% 251.33 ± 25% sched_debug.cpu.nr_uninterruptible.max
-31.50 +32.3% -41.67 sched_debug.cpu.nr_uninterruptible.min
20.74 ± 10% +46.9% 30.46 ± 27% sched_debug.cpu.nr_uninterruptible.stddev
25.34 -62.1% 9.60 perf-stat.i.MPKI
1.194e+10 -22.8% 9.22e+09 ± 3% perf-stat.i.branch-instructions
0.24 -0.0 0.20 ± 2% perf-stat.i.branch-miss-rate%
27059070 -39.8% 16288026 perf-stat.i.branch-misses
64.90 -7.5 57.38 perf-stat.i.cache-miss-rate%
1.79e+09 -64.7% 6.318e+08 ± 4% perf-stat.i.cache-misses
2.76e+09 -62.9% 1.025e+09 ± 3% perf-stat.i.cache-references
581182 +36.2% 791633 perf-stat.i.context-switches
3.20 -49.5% 1.62 ± 2% perf-stat.i.cpi
2.278e+11 -61.3% 8.824e+10 ± 2% perf-stat.i.cpu-cycles
4424 -80.9% 844.69 ± 2% perf-stat.i.cpu-migrations
128.63 ± 2% +118.5% 281.11 ± 3% perf-stat.i.cycles-between-cache-misses
7.238e+10 -11.1% 6.435e+10 ± 4% perf-stat.i.instructions
0.33 +153.9% 0.83 perf-stat.i.ipc
1.01 ± 24% -74.0% 0.26 ± 35% perf-stat.i.major-faults
3.02 +36.3% 4.11 perf-stat.i.metric.K/sec
8629 ± 2% +228.8% 28378 ± 6% perf-stat.i.minor-faults
8631 ± 2% +228.8% 28379 ± 6% perf-stat.i.page-faults
24.71 -59.9% 9.90 perf-stat.overall.MPKI
0.23 -0.0 0.18 ± 3% perf-stat.overall.branch-miss-rate%
64.93 -3.1 61.88 perf-stat.overall.cache-miss-rate%
3.16 -55.7% 1.40 ± 3% perf-stat.overall.cpi
128.00 +10.7% 141.70 ± 2% perf-stat.overall.cycles-between-cache-misses
0.32 +125.7% 0.71 ± 2% perf-stat.overall.ipc
1.178e+10 -22.7% 9.1e+09 ± 3% perf-stat.ps.branch-instructions
26597284 -39.8% 16001940 perf-stat.ps.branch-misses
1.766e+09 -64.4% 6.289e+08 ± 4% perf-stat.ps.cache-misses
2.72e+09 -62.7% 1.016e+09 ± 3% perf-stat.ps.cache-references
570400 +36.0% 775987 perf-stat.ps.context-switches
2.261e+11 -60.6% 8.903e+10 perf-stat.ps.cpu-cycles
4386 -80.8% 840.61 ± 2% perf-stat.ps.cpu-migrations
7.151e+10 -11.1% 6.356e+10 ± 4% perf-stat.ps.instructions
0.99 ± 24% -73.9% 0.26 ± 35% perf-stat.ps.major-faults
8539 ± 2% +227.2% 27937 ± 6% perf-stat.ps.minor-faults
8540 ± 2% +227.1% 27939 ± 6% perf-stat.ps.page-faults
4.762e+12 ± 2% -9.5% 4.311e+12 ± 4% perf-stat.total.instructions
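
As a reading aid for the comparison table above, the following is a minimal
sketch of how the middle column can be re-derived from the two value columns.
It assumes (this is inferred from the numbers, not stated in the report) that
rows carrying a "%" sign in the middle column show a relative change against
the parent commit, while rows for metrics that are already percentages (for
example mpstat.cpu.all.idle% or turbostat.Busy%) show a plain difference in
percentage points. The helper names pct_change and point_delta are
hypothetical and are not part of lkp-tests.

```python
# Illustrative helpers only; names are hypothetical, not part of lkp-tests.


def pct_change(base: float, new: float) -> float:
    """Relative change from the parent-commit value to the patched value, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference, assumed here for metrics already expressed in percent."""
    return new - base


if __name__ == "__main__":
    # Values copied from the table above (left column = 2d028f3e4b, right = a2e0c0668a).
    print(f"stress-ng.memthrash.ops_per_sec: {pct_change(439.06, 579.80):+.1f}%")
    print(f"stress-ng.memthrash.ops:         {pct_change(28270, 38095):+.1f}%")
    print(f"stress-ng.time.system_time:      {pct_change(3079, 347.19):+.1f}%")
    print(f"mpstat.cpu.all.idle%:            {point_delta(62.40, 86.02):+.1f} points")
```

Small mismatches against the reported column are possible on other rows,
since the displayed values are rounded averages over several runs.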




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki