Re: [Patch v8 11/23] perf/x86: Enable XMM register sampling for REGS_USER case

From: Mi, Dapeng

Date: Mon Jun 01 2026 - 01:55:35 EST



On 5/29/2026 7:42 PM, Peter Zijlstra wrote:
> On Fri, May 29, 2026 at 03:56:33PM +0800, Dapeng Mi wrote:
>> This patch adds support for XMM register sampling in the REGS_USER case.
>>
>> To handle simultaneous sampling of XMM registers for both REGS_INTR and
>> REGS_USER cases, a per-CPU `x86_user_regs` is introduced to store
>> REGS_USER-specific XMM registers. This prevents REGS_USER-specific XMM
>> register data from being overwritten by REGS_INTR-specific data if they
>> share the same `x86_perf_regs` structure.
>>
>> To sample user-space XMM registers, the `x86_pmu_update_user_xregs()`
>> helper function is added. It checks if the `TIF_NEED_FPU_LOAD` flag is
>> set. If so, the user-space XMM register data can be directly retrieved
>> from the cached task FPU state, as the corresponding hardware registers
>> have been cleared or switched to kernel-space data. Otherwise, the data
>> must be read from the hardware registers using the `xsaves` instruction.
>>
>> For PEBS events, `x86_pmu_update_user_xregs()` checks if the PEBS-sampled
>> XMM register data belongs to user-space. If so, no further action is
>> needed. Otherwise, the user-space XMM register data needs to be
>> re-sampled using the same method as for non-PEBS events.
>>
>> Co-developed-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@xxxxxxxxxxxxxxx>
> Sashiko has fun comments; I don't think we care about the cross-vm data
> leak, that's not worse than we already have on the regular regs. In
> fact, it might be considered correct behaviour ;-)

Yes, this is supposed to be the correct and intended behavior. Users may
want to profile guest behavior with host events, and that is what the
current patch does.


>
> It does have a point about noxsaves; or xsaves being masked by a VM.

Yes. As mentioned in my previous comments, I will enhance the check and set
the PERF_PMU_CAP_EXTENDED_REGS capability only when both PEBS and xsaves
support XMM sampling.
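
Roughly, the enhanced check could follow the shape below. This is only a
user-space sketch of the gating logic, not the actual patch code: the
struct, its fields, the function name and the flag value are all
placeholders made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag value for illustration; not the kernel's definition. */
#define SKETCH_CAP_EXTENDED_REGS 0x1u

/* Hypothetical summary of what the platform reports at init time. */
struct sketch_pmu_features {
	bool pebs_xmm;   /* PEBS is able to capture XMM registers */
	bool xsaves_xmm; /* xsaves is usable (not "noxsaves", not masked by
			  * a VM) and covers the XMM state component */
};

/*
 * Advertise the extended-regs capability only when both sources of XMM
 * data are available; otherwise make sure the bit is cleared.
 */
static uint32_t sketch_pmu_caps(const struct sketch_pmu_features *f,
				uint32_t caps)
{
	if (f->pebs_xmm && f->xsaves_xmm)
		caps |= SKETCH_CAP_EXTENDED_REGS;
	else
		caps &= ~SKETCH_CAP_EXTENDED_REGS;

	return caps;
}
```

With this shape, a guest where xsaves is masked would simply not see the
capability, so a request for XMM registers would be rejected up front
rather than failing at sample time.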

Thanks.