Re: [PATCH v2 0/5] mm/shmem: optimize read with reduced xarray lookups and folio batching

From: Andrew Morton

Date: Mon Jun 01 2026 - 20:43:55 EST


On Mon, 1 Jun 2026 13:56:59 +0800 Chi Zhiling <chizhiling@xxxxxxx> wrote:

> From: Chi Zhiling <chizhiling@xxxxxxxxxx>
>
> This series improves shmem read performance by implementing folio
> batching in the read path and reducing unnecessary xarray lookups.
>
> Performance Results:
>
> fio --ioengine=sync --rw=read --bs=$1 --size=1G --runtime=180 --time_based --group_reporting --name=seq_read_test --filename=testfile
>
> | THP disabled in tmpfs | v7.1-rc5 | v7.1-rc5 + fbatch | Improvement |
> | ---------------------- | ------------ | ----------------- | ----------- |
> | 1M + normal file | bw=11.5GiB/s | bw=12.7GiB/s | +10.4% |
> | 64k + normal file | bw=11.0GiB/s | bw=12.3GiB/s | +11.8% |
> | 4k + normal file | bw=3826MiB/s | bw=3849MiB/s | +0.6% |
> | 1M + fallocated file | bw=23.8GiB/s | bw=28.6GiB/s | +20.2% |
> | 64k + fallocated file | bw=22.5GiB/s | bw=27.3GiB/s | +21.3% |
> | 4k + fallocated file | bw=4655MiB/s | bw=4680MiB/s | +0.5% |
> | 1M + hole | bw=24.2GiB/s | bw=28.6GiB/s | +18.2% |
> | 64k + hole | bw=22.6GiB/s | bw=27.6GiB/s | +22.1% |
> | 4k + hole | bw=4652MiB/s | bw=4489MiB/s | -3.5% |
>
>
> | THP enabled in tmpfs | v7.1-rc5 | v7.1-rc5 + fbatch | Improvement |
> | --------------------- | ------------ | ----------------- | ----------- |
> | 1M + normal file | bw=13.7GiB/s | bw=13.9GiB/s | +1.4% |
> | 64k + normal file | bw=13.5GiB/s | bw=13.5GiB/s | +0.0% |
> | 4k + normal file | bw=3833MiB/s | bw=3859MiB/s | +0.7% |
> | 1M + fallocated file | bw=24.9GiB/s | bw=34.2GiB/s | +37.3% |
> | 64k + fallocated file | bw=23.0GiB/s | bw=31.4GiB/s | +36.5% |
> | 4k + fallocated file | bw=4710MiB/s | bw=4655MiB/s | -1.2% |
> | 1M + hole | bw=24.3GiB/s | bw=34.5GiB/s | +42.0% |
> | 64k + hole | bw=23.5GiB/s | bw=31.1GiB/s | +32.3% |
> | 4k + hole | bw=4690MiB/s | bw=4647MiB/s | -0.9% |
>

That looks nice.

Microbenchmarks are useful, but are you able to help us understand how
much benefit our users might see in real-world workloads?

I'll take no action at this time - it's late in the cycle and reviewers
have yet to participate.

AI review flagged a few possible issues, so please take a look:
https://sashiko.dev/#/patchset/20260601055704.167436-1-chizhiling@xxxxxxx