Re: [RFC PATCH 1/1] mm: batch page copies in folio_copy() and folio_mc_copy()
From: Garg, Shivank
Date: Tue May 19 2026 - 01:43:38 EST
On 5/18/2026 7:50 PM, David Laight wrote:
> On Mon, 18 May 2026 10:43:22 +0200
> "David Hildenbrand (Arm)" <david@xxxxxxxxxx> wrote:
>
> ...
>>> Another option is to leave memcpy() untouched for this series and add
>>> a new copy_pages() helper that the folio copy path can use. It would
>>> use ALTERNATIVE_2 that picks rep movsb on ERMS/FSRM and rep movsq on
>>> REP_GOOD and per-page copy_page() loop as the final fallback.
>>
>> That would fit the clear_pages() design we have. But if that's avoidable, that
>> would be nice.
>>
>
> For full pages 'rep movsq' is likely to be 'almost the best' on all cpu.
> The fixed overhead is amortised over a lot of copies so has little impact.
> (My brain suggests a value of 30 clocks - ignoring P4 netburst.)
> For Intel cpu the aligned destination will double throughput.
>
> I did a load of benchmarking of 'rep movsb' on my Zen-5.
> (I should be able to find the results again.)
> The real oddity was copies where (something like):
> 0 < (dst - src) & 4095 < 128
> when the startup time was a lot longer and the copy ran massively slower.
>
> I need to run those tests on some other cpu.
> However I don't have any older AMD ones (except a piledriver) or Intel
> ones newer than an i7-7 (Kaby lake?).
> (I need to get my Apollo Lake N3350 into the test set for comparison.)
> From what I remember of some earlier benchmarking (which failed to
> measure the fixed cost properly) even Sandy bridge handles 'rep movsb'
> and 'rep movsq' the same way.
>
> The problem with memcpy() is you want a hint from the source about the
> likely length and any alignment assumptions.
> Otherwise the costs of the conditionals become significant.

Hi David,

I have not benchmarked the specific case you mentioned. I tested only the aligned case.
If you can share your benchmarking script or test cases, I'm happy to run them across the AMD
microarchitectures I have access to.
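
In case it helps line up the test cases, below is a minimal userspace sketch of
how I read the condition above. It is only an assumption about the intended
meaning: as written, C precedence would parse the expression as
(0 < (dst - src)) & (4095 < 128), so the sketch parenthesizes it explicitly.
The names in_slow_window() and copy_rep_movsb() are hypothetical, not existing
kernel helpers, and the non-x86 fallback to memcpy() is there only so the
sketch builds everywhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 when dst sits 1..127 bytes past src
 * modulo 4 KiB, i.e. the window described as slow in the quoted text.
 * The explicit parentheses are an assumption about the intended meaning.
 */
static int in_slow_window(const void *dst, const void *src)
{
	uintptr_t delta = ((uintptr_t)dst - (uintptr_t)src) & 4095;

	return delta > 0 && delta < 128;
}

/*
 * Hypothetical helper: copy len bytes with 'rep movsb' on x86 so the
 * instruction under test is not replaced by a libc memcpy() variant.
 * Other architectures fall back to memcpy().
 */
static void copy_rep_movsb(void *dst, const void *src, size_t len)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ volatile("rep movsb"
			 : "+D"(dst), "+S"(src), "+c"(len)
			 :
			 : "memory");
#else
	memcpy(dst, src, len);
#endif
}
```

A benchmark loop could then sweep the destination offset from 0 to 4095 bytes
and time copy_rep_movsb() separately for offsets inside and outside the window.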
Thanks,
Shivank