[PATCH v1 1/5] mm/filemap: reduce unnecessary xarray lookups when reading cached pages
From: Chi Zhiling
Date: Wed May 20 2026 - 06:25:34 EST
From: Chi Zhiling <chizhiling@xxxxxxxxxx>
When reading small amounts of data from the page cache,
filemap_get_read_batch() typically returns only a single folio. In this
case, calling xas_advance() or xas_next() after adding the folio to the
batch is unnecessary and only introduces extra branches.

The same issue exists for large reads, where one additional xarray walk
is always performed before termination.

Move the boundary check to after the folio is added to the batch so the
final redundant xarray advancement can be avoided. In the fio test below
this reduces the total branch count in the read path by roughly 1.8%
(160.67G -> 157.72G branches).

The first lookup is covered by a new index > max check before the loop.
xas_next() does not update xa_index when xas->xa_node is set to
XAS_RESTART, so checking the boundary before updating xa_index is
sufficient to keep the folio within range. The WARN_ON() in the loop
should therefore never trigger.
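The effect of moving the boundary check can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not kernel
code: struct model_folio, walk_old() and walk_new() are hypothetical names,
folios are modelled as a contiguous array with no holes, and "steps" counts
how often the cursor is advanced (standing in for the xas_advance() and
xas_next() pair).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio covering pages [index, index + nr). */
struct model_folio {
	unsigned long index;
	unsigned long nr;
};

/*
 * Old ordering: always advance after batching a folio, and only notice
 * that the range is exhausted at the top of the next iteration.
 * Returns the number of folios batched; *steps counts cursor advances.
 */
static size_t walk_old(const struct model_folio *f, size_t n,
		       unsigned long max, size_t *steps)
{
	size_t i = 0, batched = 0;

	assert(f && steps);
	*steps = 0;
	while (i < n) {
		if (f[i].index > max)	/* boundary check before batching */
			break;
		batched++;
		i++;			/* models xas_advance() + xas_next() */
		(*steps)++;
	}
	return batched;
}

/*
 * New ordering: the first folio is checked up front, and the boundary is
 * tested right after batching, so the final advance is skipped when the
 * batched folio already reaches past max.
 */
static size_t walk_new(const struct model_folio *f, size_t n,
		       unsigned long max, size_t *steps)
{
	size_t i = 0, batched = 0;

	assert(f && steps);
	*steps = 0;
	if (n == 0 || f[0].index > max)	/* models the early index > max return */
		return 0;
	while (i < n) {
		batched++;
		if (f[i].index + f[i].nr > max)	/* next index is out of range */
			break;
		i++;
		(*steps)++;
	}
	return batched;
}
```

In this model, a single one-page folio with max = 0 costs one advance in
walk_old() and none in walk_new(); four one-page folios with max = 3 cost
four and three advances respectively, with the same number of folios batched.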
Branch rate reported by perf stat:
  654.198 M/sec -> 646.444 M/sec
Performance counter stats for 'fio --ioengine=sync --rw=read --bs=4k --size=1G
--runtime=300 --time_based --group_reporting --name=seq_read_test --filename=file':

before:
  READ: bw=2697MiB/s (2828MB/s), 2697MiB/s-2697MiB/s (2828MB/s-2828MB/s), io=790GiB (848GB), run=300001-300001msec

     245602051556  task-clock        #  0.821 CPUs utilized
            78467  context-switches  #  319.488 /sec
               40  cpu-migrations    #  0.163 /sec
             3388  page-faults       #  13.795 /sec
     758312319204  instructions      #  0.74 insn per cycle
    1025881497502  cycles            #  4.177 GHz
     160672383734  branches          #  654.198 M/sec
        361904512  branch-misses     #  0.23% of all branches

after:
  READ: bw=2709MiB/s (2841MB/s), 2709MiB/s-2709MiB/s (2841MB/s-2841MB/s), io=794GiB (852GB), run=300000-300000msec

     243985503670  task-clock        #  0.812 CPUs utilized
            79004  context-switches  #  323.806 /sec
               30  cpu-migrations    #  0.123 /sec
             3355  page-faults       #  13.751 /sec
     747830935069  instructions      #  0.73 insn per cycle
    1019609333322  cycles            #  4.179 GHz
     157722976668  branches          #  646.444 M/sec
        348984893  branch-misses     #  0.22% of all branches
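For reference, the relative change in the totals above can be worked out as
follows. This is a minimal sketch: pct_change() is a hypothetical helper, and
the constants are copied from the perf stat output above.

```c
#include <assert.h>

/* Hypothetical helper: percentage change going from 'before' to 'after'. */
static double pct_change(double before, double after)
{
	assert(before != 0.0);
	return (after - before) / before * 100.0;
}

/* Totals copied from the perf stat output above. */
static const double branches_before     = 160672383734.0;
static const double branches_after      = 157722976668.0;
static const double instructions_before = 758312319204.0;
static const double instructions_after  = 747830935069.0;
```

pct_change(branches_before, branches_after) comes out at about -1.8, and the
same calculation for instructions at about -1.4. Both runs are time-based
(300 s) and the second one moved slightly more data, so the per-byte
reduction is, if anything, a little larger.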
Signed-off-by: Chi Zhiling <chizhiling@xxxxxxxxxx>
---
mm/filemap.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index 4e636647100c..d54450e529bd 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2458,12 +2458,16 @@ static void filemap_get_read_batch(struct address_space *mapping,
 {
 	XA_STATE(xas, &mapping->i_pages, index);
 	struct folio *folio;
+	pgoff_t next;
+
+	if (unlikely(index > max))
+		return;
 
 	rcu_read_lock();
 	for (folio = xas_load(&xas); folio; folio = xas_next(&xas)) {
 		if (xas_retry(&xas, folio))
 			continue;
-		if (xas.xa_index > max || xa_is_value(folio))
+		if (xa_is_value(folio) || WARN_ON(xas.xa_index > max))
 			break;
 		if (xa_is_sibling(folio))
 			break;
@@ -2479,7 +2483,11 @@ static void filemap_get_read_batch(struct address_space *mapping,
 			break;
 		if (folio_test_readahead(folio))
 			break;
-		xas_advance(&xas, folio_next_index(folio) - 1);
+
+		next = folio_next_index(folio);
+		if (next > max)
+			break;
+		xas_advance(&xas, next - 1);
 		continue;
 put_folio:
 		folio_put(folio);
--
2.43.0