Re: [RFC PATCH] mm/mempolicy: NUMA mempolicy mismatch during remote access

From: Gregory Price

Date: Mon Mar 16 2026 - 10:11:46 EST


On Mon, Mar 16, 2026 at 08:04:24PM +0800, Lin Ruifeng wrote:
> I'd like to report an issue in the SVA I/O Page Fault (IOPF) handling path:
> a NUMA memory policy mismatch caused by deferred workqueue processing.
>
> When hardware triggers a page fault via the IOMMU SVA mechanism, it's handled
> asynchronously by a kworker thread. Although the fault handler correctly uses
> the original process's mm_struct for address space mapping, the physical page
> allocation (e.g., in do_anonymous_page()) still depends on current->mempolicy.
>
> Since current here is the kworker, not the original user process, any
> task-level NUMA policy (e.g., set_mempolicy() or numactl --membind) is
> completely ignored. Instead, allocation follows the kworker's default policy,
> which may run on a different NUMA node.
>
> A similar issue was also discussed in [1]. I was wondering if you might have
> any suggestions on how to address this issue.
>
> Link: https://lore.kernel.org/linux-mm/e2d5f3a5-f6f1-4567-a162-a0e814292738@xxxxxxxxxxxxx/
> Signed-off-by: Lin Ruifeng <linruifeng4@xxxxxxxxxx>
> 2.43.0
>

I think the best we could do in this scenario is acquire the mempolicy
of the process that owns the mm_struct and plumb it through - but this
is not exactly correct either, as multiple threads within a process may
have different mempolicies.

I imagine this applies to vma policies as well - so you'd need to check
both the vma and the task. Not exactly great, since we're talking about
taking multiple locks where there were none before.
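
To make the lookup order concrete, here is a rough userspace sketch. It
is not kernel code: every toy_* name is made up for illustration, and it
assumes the usual precedence of vma policy first, then the task's
policy, then the system default. It ignores locking and refcounting
entirely, which is the hard part.

```c
#include <stddef.h>

/* Userspace toy model; none of these types are the kernel's. */
enum toy_mode { TOY_DEFAULT, TOY_BIND };

struct toy_policy {
	enum toy_mode mode;
	int node;		/* only meaningful for TOY_BIND */
};

struct toy_task {
	const struct toy_policy *mempolicy;	/* NULL: no task policy set */
};

struct toy_vma {
	const struct toy_policy *policy;	/* NULL: no vma policy set */
};

static const struct toy_policy toy_default_policy = { TOY_DEFAULT, -1 };

/*
 * Resolve the policy for a fault on @vma handled on behalf of @task:
 * vma policy wins, then the task's policy, then the system default.
 */
static const struct toy_policy *
toy_effective_policy(const struct toy_vma *vma, const struct toy_task *task)
{
	if (vma && vma->policy)
		return vma->policy;
	if (task && task->mempolicy)
		return task->mempolicy;
	return &toy_default_policy;
}
```

With no vma policy set, passing a task that has no policy of its own
(standing in for the kworker) falls through to the default, while
passing the originating task returns its bind policy. So the result
depends on which task gets plumbed through to the lookup.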

~Gregory