Re: [PATCH v3 00/13] mm/huge_memory: refactor zap_huge_pmd()

From: Lorenzo Stoakes (Oracle)

Date: Mon Mar 23 2026 - 08:16:44 EST


I really don't want to have to reply to every bit of noise generated by
Sashiko, but for the sake of it, the stuff I'm not changing:

4/13

>Are other PMD handlers vulnerable to this same userfaultfd bug?

That's completely out of scope for the series. It's not a regression, it's
a suggestion for future work, and so should be labelled as such rather than
listed as a 'high' priority regression.

7/13

>Is it possible for the error path immediately preceding this block to trigger
>a NULL pointer dereference?

You're already at the point where something has gone horribly wrong and a
bug in the kernel renders it unstable. It's not really worth trying to
avoid every possible bad outcome here in case of kernel bugs, that's not
practical and not worth the maintenance load to put such things in.

Also if I add this, it breaks further refactorings for the sake of
defending against a specific class of kernel bug - that's not worthwhile.

10/13

>Could evaluating folio_is_device_private(folio) cause issues if the PMD
>contains a migration entry rather than a device private entry?

If this is even possible (I don't think so), that was an existing
issue. This is a refactoring, it'd not be appropriate for me to
fundamentally change behaviour at the same time.

>This isn't a bug, but the comment still refers to flush_needed, which was
> renamed to is_present in this patch.

Baolin already raised this, and I don't think it really matters to leave
that comment in there as it's removed in a later commit, and a comment
isn't a bisection hazard :)

11/13

>This isn't a bug, but the kernel-doc for pmd_is_valid_softleaf() states
>that it asserts the validity of the entry, while the function strictly

This is a turn of phrase...! Anybody wondering can go read the function.

>returns a boolean without triggering any warnings or bugs.
>Would it be better to update this comment to reflect the actual behavior,
>especially now that an actual assertion has been added to the neighboring
>pmd_to_softleaf_folio() function?

I think the CONFIG_DEBUG_VM assert itself is pretty good documentation of
there being a CONFIG_DEBUG_VM assert honestly. Should the kdoc comments
list the code too?

>Could this warning be written to evaluate the condition directly?
>if (VM_WARN_ON_ONCE(!softleaf_is_valid_pmd_entry(entry))) {
> return NULL;
>}
>When VM_WARN_ON_ONCE(true) is placed inside an if block, the kernel's
>warning machinery stringifies and prints "true" as the failing condition
>in the backtrace, which makes debugging more difficult. Wrapping the actual
>condition inside the warning macro ensures the specific violated constraint
>is visible in the console output.

This is a silly comment anyway, you can figure out why the thing failed
very easily and this is a common pattern in the kernel.

But this is also a hallucination, VM_WARN_ON_ONCE() is defined as:

#define VM_WARN_ON_ONCE(cond) (void)WARN_ON_ONCE(cond)

So, you know, that won't work (a void expression can't be used as an if
condition), and even if it did, it's a silly and pedantic comment. Plus you
don't use {} for single line branches...
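
To illustrate the point, here is a rough userspace sketch, not kernel
code. warn_stub(), warn_hits, checked_folio() and the placeholder struct
folio are made up for the example, and the stub only counts hits rather
than printing a once-per-call-site backtrace. It shows the common pattern
of testing the condition in the if and warning inside the branch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace stand-ins, NOT the kernel definitions. The real WARN_ON_ONCE()
 * prints a backtrace once per call site; this stub only counts hits.
 */
static int warn_hits;

static bool warn_stub(bool cond)
{
	if (cond)
		warn_hits++;
	return cond;
}

#define WARN_ON_ONCE(cond)	warn_stub(cond)
/* Same shape as the definition quoted above: the result is discarded. */
#define VM_WARN_ON_ONCE(cond)	(void)WARN_ON_ONCE(cond)

struct folio { int id; };	/* hypothetical placeholder type */

static struct folio *checked_folio(bool entry_is_valid, struct folio *folio)
{
	/*
	 * 'if (VM_WARN_ON_ONCE(!entry_is_valid))' would not compile here,
	 * because the macro expands to a void expression.
	 */
	if (!entry_is_valid) {
		VM_WARN_ON_ONCE(true);
		return NULL;
	}
	return folio;
}

/* Exercises both paths: the valid one is silent, the invalid one warns. */
static void self_check(void)
{
	struct folio f = { .id = 1 };

	assert(checked_folio(true, &f) == &f);
	assert(warn_hits == 0);
	assert(checked_folio(false, &f) == NULL);
	assert(warn_hits == 1);
}
```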

12/13

While the comment about deposit was valid, the comment:

>For non-DAX special VMAs, this also forces has_deposit to true even if
>the architecture does not need a deposit, potentially attempting to free a
>non-existent deposit.

Is another hallucination:

	else if (is_huge_zero_pmd(orig_pmd))
		has_deposit = !vma_is_dax(vma);

This is the line it's discussing. So we're explicitly gating on
is_huge_zero_pmd(). It's not any 'special' VMA.

And in the original code:

	} else if (is_huge_zero_pmd(orig_pmd)) {
		if (!vma_is_dax(vma) || arch_needs_pgtable_deposit())
			zap_deposited_table(tlb->mm, pmd);
		...
	}

With the fix-patch applied this is 'has_deposit = has_deposit ||
!vma_is_dax()' where has_deposit is initialised with
arch_needs_pgtable_deposit(), so the logic matches.
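
For anyone who wants to convince themselves, the equivalence can be
checked over all four input combinations with a small userspace sketch.
This is not kernel code: the function names are invented for the example,
the two booleans stand in for vma_is_dax(vma) and
arch_needs_pgtable_deposit(), and the 'unfixed' variant assumes the
pre-fix line simply overwrote has_deposit as in the snippet quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Original huge zero PMD branch: zap if !dax || arch needs a deposit. */
static bool orig_zaps_deposit(bool is_dax, bool arch_needs_deposit)
{
	return !is_dax || arch_needs_deposit;
}

/* Refactored logic with the fix-patch applied. */
static bool fixed_zaps_deposit(bool is_dax, bool arch_needs_deposit)
{
	bool has_deposit = arch_needs_deposit;

	has_deposit = has_deposit || !is_dax;
	return has_deposit;
}

/* Assumed pre-fix form: the assignment discards the initial value. */
static bool unfixed_zaps_deposit(bool is_dax, bool arch_needs_deposit)
{
	bool has_deposit = arch_needs_deposit;

	has_deposit = !is_dax;
	return has_deposit;
}

/* Count the input combinations where fn disagrees with the original. */
static int count_mismatches(bool (*fn)(bool, bool))
{
	int mismatches = 0;

	for (int dax = 0; dax <= 1; dax++)
		for (int arch = 0; arch <= 1; arch++)
			if (fn(dax, arch) != orig_zaps_deposit(dax, arch))
				mismatches++;
	return mismatches;
}

static void self_check(void)
{
	/* With the fix-patch the logic matches in every case. */
	assert(count_mismatches(fixed_zaps_deposit) == 0);
	/* Without it, only the DAX + arch-needs-deposit case differs. */
	assert(count_mismatches(unfixed_zaps_deposit) == 1);
	assert(orig_zaps_deposit(true, true) && !unfixed_zaps_deposit(true, true));
}
```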

--

Dealing with the above has taken a lot of time that I would rather spend
doing other things.

AI can asymmetrically drown us with this kind of stuff, radically
increasing workload.

This continues to be my primary concern with the application of AI, and the
only acceptable use of it will be in cases where we are able to filter
things well enough to avoid wasting people's time like this.

As I've said before, burnout continues to be a major hazard that is simply
being ignored in mm, and I don't think that's a healthy or good thing.

Let's please be considerate of the opinions of those actually doing the
work.

Thanks, Lorenzo