Re: [PATCH v5] mm/userfaultfd: detect VMA type change after copy retry in mfill_copy_folio_retry()

From: Peter Xu

Date: Thu Apr 09 2026 - 13:14:18 EST


On Thu, Apr 09, 2026 at 01:06:53PM +0100, David Carlier wrote:
> mfill_copy_folio_retry() drops mmap_lock for the copy_from_user() call.
> During this window, the VMA can be replaced with a different type (e.g.
> hugetlb), making the caller's ops pointer stale. Subsequent use of the
> stale ops can lead to incorrect folio handling or a kernel crash.
>
> Pass the caller's ops into mfill_copy_folio_retry() and compare against
> the current vma_uffd_ops() after re-acquiring the lock. Return -EAGAIN
> if they differ so the operation can be retried.
>
> Fixes: 59da5c32ffa3 ("userfaultfd: mfill_atomic(): remove retry logic")
> Signed-off-by: David Carlier <devnexen@xxxxxxxxx>
> ---
> mm/userfaultfd.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index 481ec7eb4442..214923a411c1 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -443,7 +443,9 @@ static int mfill_copy_folio_locked(struct folio *folio, unsigned long src_addr)
> return ret;
> }
>
> -static int mfill_copy_folio_retry(struct mfill_state *state, struct folio *folio)
> +static int mfill_copy_folio_retry(struct mfill_state *state,
> + const struct vm_uffd_ops *ops,
> + struct folio *folio)
> {
> unsigned long src_addr = state->src_addr;
> void *kaddr;
> @@ -465,6 +467,14 @@ static int mfill_copy_folio_retry(struct mfill_state *state, struct folio *folio
> if (err)
> return err;
>
> + /*
> + * The VMA type may have changed while the lock was dropped
> + * (e.g. replaced with a hugetlb mapping), making the caller's
> + * ops pointer stale.
> + */
> + if (vma_uffd_ops(state->vma) != ops)
> + return -EAGAIN;

I agree with -EAGAIN here, but we already discussed the possibility of the
inode changing, and I don't see why this version doesn't take that into
account.

I still think those should be considered.

If the vma snapshot idea is not welcome, fine, but then we need to think of
something else to cover those cases too. The current patch won't catch
"ops unchanged" but "inode changed", or offset changed, for example.
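
To illustrate the gap, below is a rough userspace model, not kernel code.
Every name in it (model_vma, model_snapshot, model_ops_only_matches() and
so on) is hypothetical and only stands in for the real structures. It
assumes a snapshot would record the ops pointer, the backing inode and the
page offset before the lock is dropped, and compare all three afterwards.

```c
/* Userspace model only: hypothetical stand-ins, not kernel structures. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_ops {
	int id;				/* placeholder for a vm_uffd_ops table */
};

struct model_vma {
	const struct model_ops *ops;	/* stands in for vma_uffd_ops(vma) */
	unsigned long inode_id;		/* stands in for the backing inode */
	unsigned long pgoff;		/* stands in for the file offset */
};

/* State recorded before the (modelled) lock is dropped. */
struct model_snapshot {
	const struct model_ops *ops;
	unsigned long inode_id;
	unsigned long pgoff;
};

static struct model_snapshot model_snapshot_take(const struct model_vma *vma)
{
	struct model_snapshot snap;

	assert(vma != NULL);
	snap.ops = vma->ops;
	snap.inode_id = vma->inode_id;
	snap.pgoff = vma->pgoff;
	return snap;
}

/* Models the check in the patch: only the ops pointer is compared. */
static bool model_ops_only_matches(const struct model_ops *ops,
				   const struct model_vma *vma)
{
	return vma->ops == ops;
}

/* Models a snapshot check: ops, inode and offset must all be unchanged. */
static bool model_snapshot_matches(const struct model_snapshot *snap,
				   const struct model_vma *vma)
{
	return vma->ops == snap->ops &&
	       vma->inode_id == snap->inode_id &&
	       vma->pgoff == snap->pgoff;
}
```

In this model, remapping the same range to a different file of the same
type leaves ops untouched, so model_ops_only_matches() still returns true
while model_snapshot_matches() returns false. Which fields a real snapshot
would need is still open.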

Thanks,

> +
> err = mfill_establish_pmd(state);
> if (err)
> return err;
> @@ -495,7 +505,7 @@ static int __mfill_atomic_pte(struct mfill_state *state,
> * will take care of unlocking if needed.
> */
> if (unlikely(ret)) {
> - ret = mfill_copy_folio_retry(state, folio);
> + ret = mfill_copy_folio_retry(state, ops, folio);
> if (ret)
> goto err_folio_put;
> }
> --
> 2.53.0
>

--
Peter Xu