[RFC PATCH 0/4] mm/shmem: optimize read performance with folio batching

From: Chi Zhiling

Date: Fri May 15 2026 - 06:05:47 EST


From: Chi Zhiling <chizhiling@xxxxxxxxxx>

This series optimizes shmem read performance by implementing folio
batching in the read path and eliminating unnecessary lock operations.

Performance testing with fio:
(--ioengine=sync --rw=read --size=1G --runtime=120)
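
For anyone reproducing the runs, the options above could be expanded into one fio command line per block size, roughly as sketched below. Only the four options in parentheses come from this cover letter. The job name, the /dev/shm directory, and passing the block size via --bs are assumptions for illustration. THP and fallocate setup are not shown.

```python
import shlex

# Options quoted verbatim in the cover letter.
BASE_OPTS = ["--ioengine=sync", "--rw=read", "--size=1G", "--runtime=120"]

# Block sizes that appear in the results.
BLOCK_SIZES = ["1M", "64k", "4k"]


def fio_argv(bs, directory="/dev/shm", name="shmem-read"):
    """Build the argv for one fio run.

    `directory` and `name` are hypothetical defaults, not taken from
    the cover letter; point `directory` at the tmpfs mount under test.
    """
    return ["fio", f"--name={name}", f"--directory={directory}",
            f"--bs={bs}", *BASE_OPTS]


if __name__ == "__main__":
    # Print the command lines instead of running them, so the sketch
    # works on machines without fio installed.
    for bs in BLOCK_SIZES:
        print(shlex.join(fio_argv(bs)))
```

The printed lines can be pasted into a shell once the tmpfs mount is configured as desired.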

shmem (THP disabled):
bs=1M: 11.4 GiB/s
bs=64k: 11.1 GiB/s
bs=4k: 3814 MiB/s

shmem (THP disabled) + fbatch:
bs=1M: 12.8 GiB/s (+12%)
bs=64k: 12.3 GiB/s (+11%)
bs=4k: 3783 MiB/s (-0.8%)

shmem (THP enabled):
bs=1M: 13.8 GiB/s
bs=64k: 13.1 GiB/s
bs=4k: 3851 MiB/s

shmem (THP enabled) + fbatch:
bs=1M: 14.0 GiB/s (+1%)
bs=64k: 13.4 GiB/s (+2%)
bs=4k: 3811 MiB/s (-1%)

shmem preallocated via fallocate (THP disabled):
bs=1M: 24.0 GiB/s
bs=64k: 22.5 GiB/s
bs=4k: 4670 MiB/s

shmem preallocated via fallocate (THP disabled) + fbatch:
bs=1M: 29.3 GiB/s (+22%)
bs=64k: 26.7 GiB/s (+19%)
bs=4k: 4654 MiB/s (-0.3%)

shmem preallocated via fallocate (THP enabled):
bs=1M: 24.0 GiB/s
bs=64k: 22.9 GiB/s
bs=4k: 4698 MiB/s

shmem preallocated via fallocate (THP enabled) + fbatch:
bs=1M: 34.3 GiB/s (+43%)
bs=64k: 31.5 GiB/s (+38%)
bs=4k: 4689 MiB/s (-0.2%)
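
The percentages in parentheses appear to be the change relative to the matching run without fbatch. A small sketch that recomputes them from the figures above (values copied from this cover letter, and the helper name is made up for illustration):

```python
# (configuration, block size, baseline, with fbatch), copied from the results.
# Within a row both values share one unit (GiB/s for 1M and 64k, MiB/s for 4k),
# so the relative change does not depend on the unit.
ROWS = [
    ("shmem, THP disabled", "1M", 11.4, 12.8),
    ("shmem, THP disabled", "64k", 11.1, 12.3),
    ("shmem, THP disabled", "4k", 3814, 3783),
    ("shmem, THP enabled", "1M", 13.8, 14.0),
    ("shmem, THP enabled", "64k", 13.1, 13.4),
    ("shmem, THP enabled", "4k", 3851, 3811),
    ("fallocate, THP disabled", "1M", 24.0, 29.3),
    ("fallocate, THP disabled", "64k", 22.5, 26.7),
    ("fallocate, THP disabled", "4k", 4670, 4654),
    ("fallocate, THP enabled", "1M", 24.0, 34.3),
    ("fallocate, THP enabled", "64k", 22.9, 31.5),
    ("fallocate, THP enabled", "4k", 4698, 4689),
]


def percent_change(baseline, patched):
    """Relative change of `patched` over `baseline`, in percent."""
    return (patched - baseline) / baseline * 100.0


if __name__ == "__main__":
    for config, bs, base, patched in ROWS:
        delta = percent_change(base, patched)
        print(f"{config:<24} bs={bs:<4} {delta:+.1f}%")
```

Read this way, the gains are concentrated at bs=1M and bs=64k, while the bs=4k results stay within about 1% of the baseline in every configuration.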


Chi Zhiling (4):
  mm/shmem: add SGP_GET to get unlocked folio
  mm/shmem: use SGP_GET in read operations
  mm/shmem: optimize file read with folio batching
  mm/shmem: make SGP_NOALLOC succeed on hole like SGP_READ

 include/linux/shmem_fs.h |   5 +-
 mm/khugepaged.c          |   2 +-
 mm/shmem.c               | 132 ++++++++++++++++++++++++++++++++-------
 3 files changed, 112 insertions(+), 27 deletions(-)

--
2.43.0