Re: [PATCH] x86/mm: resize user_pcid_flush_mask for PTI / broadcast TLB flush combination

From: Rik van Riel
Date: Sat May 17 2025 - 17:30:36 EST


On Sat, 2025-05-17 at 09:59 +0200, Ingo Molnar wrote:
>
> CONFIG_X86_TLB_BROADCAST_TLB_FLUSH doesn't actually exist, the name is
> CONFIG_BROADCAST_TLB_FLUSH.
>
Argh, cut'n'pasted from the wrong tree :(

>
> we could make this a more obvious:
>

> And we can drop the ugly & fragile type cast in
> invalidate_user_asid():
>
> -	__set_bit(kern_pcid(asid),
> -		  (unsigned long *)this_cpu_ptr(&cpu_tlbstate.user_pcid_flush_mask));
>
> +	__set_bit(kern_pcid(asid),
> +		  this_cpu_ptr(cpu_tlbstate.user_pcid_flush_mask));
>
That is a really nice improvement, and it almost
works, too ;)

In file included from ./arch/x86/include/asm/bitops.h:430,
                 from ./include/linux/bitops.h:68:
./include/asm-generic/bitops/instrumented-non-atomic.h:26:54: note: expected ‘volatile long unsigned int *’ but argument is of type ‘long unsigned int (*)[32]’
   26 | ___set_bit(unsigned long nr, volatile unsigned long *addr)
      |                              ~~~~~~~~~~~~~~~~~~~~~~~~^~~~


I ended up settling for this:

	__set_bit(kern_pcid(asid),
		  this_cpu_ptr(&cpu_tlbstate.user_pcid_flush_mask[0]));
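
For anyone following along, here is a standalone userspace sketch of why
taking the address of element [0] satisfies the prototype while a pointer
to the whole array does not. This is not kernel code: struct
demo_tlb_state and the demo_* helpers are hypothetical stand-ins, and only
the [32] array size and the parameter type are taken from the compiler
note above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS 32 /* matches the [32] in the compiler note */

/* Hypothetical stand-in; not the kernel's struct tlb_state. */
struct demo_tlb_state {
	unsigned long user_pcid_flush_mask[MASK_LONGS];
};

/* Same parameter shape as ___set_bit() in the note: pointer to one word. */
static void demo_set_bit(unsigned long nr, volatile unsigned long *addr)
{
	addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

static int demo_test_bit(unsigned long nr, const volatile unsigned long *addr)
{
	return (addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}

/* Sets bit nr and returns the mask word that now holds it. */
static unsigned long demo_mark(struct demo_tlb_state *st, unsigned long nr)
{
	/*
	 * &st->user_pcid_flush_mask    is unsigned long (*)[32]: a pointer
	 *                              to the whole array, incompatible
	 *                              with the parameter type.
	 * &st->user_pcid_flush_mask[0] is unsigned long *: a pointer to the
	 *                              first word, which is what the
	 *                              bit helper expects.
	 */
	demo_set_bit(nr, &st->user_pcid_flush_mask[0]);
	return st->user_pcid_flush_mask[nr / BITS_PER_LONG];
}
```

Both expressions refer to the same address; only the pointer type
differs, so no cast is needed once the element address is used.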

> 3)
>
> If we are going to grow user_pcid_flush_mask from 2 bytes to 256 bytes
> then please reorder 'struct tlb_state' for cache efficiency: at minimum
> the ::cr4 shadow should move to before ::user_pcid_flush_mask. But I
> think we should probably move user_pcid_flush_mask to the end of the
> structure, where it does the least damage to cache layout.

Done. V2 to follow in another email.
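
As a rough illustration of the layout concern, here is a standalone
sketch that compares where a hot field lands before and after moving a
large bitmap to the end of a structure. It is not the real struct
tlb_state: the struct names, the hot_a field, and the 64-byte cache line
size are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CACHE_LINE 64 /* assumed cache line size in bytes */

/* Large mask in the middle: the cr4 shadow is pushed far from hot_a. */
struct demo_state_before {
	unsigned short hot_a;
	unsigned long mask[32];
	unsigned long cr4;
};

/* Large mask last: the small, frequently used fields stay together. */
struct demo_state_after {
	unsigned short hot_a;
	unsigned long cr4;
	unsigned long mask[32];
};

/* Index of the cache line (relative to the struct start) for an offset. */
static size_t demo_line_of(size_t offset)
{
	return offset / DEMO_CACHE_LINE;
}
```

Comparing demo_line_of(offsetof(struct demo_state_before, cr4)) with the
same expression for struct demo_state_after shows the cr4 shadow moving
back into the first cache line of the structure.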

Thank you!

--
All Rights Reversed.