Re: [PATCH] compiler: Simplify generic RELOC_HIDE()
From: Nathan Chancellor
Date: Mon Mar 23 2026 - 19:29:18 EST

On Thu, Mar 19, 2026 at 02:52:38PM +0100, Marco Elver wrote:
> When enabling Context Analysis (CONTEXT_ANALYSIS := y) in arch/x86/kvm
> code, Clang's Thread Safety Analysis failed to recognize that identical
> per_cpu() accesses refer to the same lock:
>
> | CC [M] arch/x86/kvm/vmx/posted_intr.o
> | arch/x86/kvm/vmx/posted_intr.c:186:2: error: releasing raw_spinlock '__ptr + __per_cpu_offset[vcpu->cpu]' that was not held [-Werror,-Wthread-safety-analysis]
> | 186 | raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
> | | ^
> | ./include/linux/spinlock.h:276:32: note: expanded from macro 'raw_spin_unlock'
> | 276 | #define raw_spin_unlock(lock) _raw_spin_unlock(lock)
> | | ^
> | arch/x86/kvm/vmx/posted_intr.c:207:1: error: raw_spinlock '__ptr + __per_cpu_offset[vcpu->cpu]' is still held at the end of function [-Werror,-Wthread-safety-analysis]
> | 207 | }
> | | ^
> | arch/x86/kvm/vmx/posted_intr.c:182:2: note: raw_spinlock acquired here
> | 182 | raw_spin_lock_nested(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu),
> | | ^
> | ./include/linux/spinlock.h:235:2: note: expanded from macro 'raw_spin_lock_nested'
> | 235 | _raw_spin_lock(((void)(subclass), (lock)))
> | | ^
> | 2 errors generated.
>
> This occurred because the default RELOC_HIDE() implementation (used by
> the per-CPU macros) is a statement expression containing an intermediate
> 'unsigned long' variable (this version appears to predate Git history).
>
> While the analysis strips away inner casts when resolving pointer
> aliases, it stops when encountering intermediate non-pointer variables
> (this is Thread Safety Analysis specific and irrelevant for codegen).
> This prevents the analysis from concluding that the pointers passed to
> e.g. raw_spin_lock() and raw_spin_unlock() were identical when per-CPU
> accessors are used.
>
> Simplify RELOC_HIDE() to a single expression. This preserves the intent
> of obfuscating UB-introducing out-of-bounds pointer calculations from
> the compiler via the 'unsigned long' cast, but allows the alias analysis
> to successfully resolve the pointers.
>
> Using a recent Clang version, I observe that generated code remains the
> same for vmlinux; the intermediate variable was already being optimized
> away (for any respectable modern compiler, not doing so would be an
> optimizer bug). Note that GCC provides its own version of RELOC_HIDE(),
> so this change only affects Clang builds.
>
> Add a test case to lib/test_context-analysis.c to catch any regressions.
>
> Link: https://lore.kernel.org/all/e3946223-4543-4a76-a328-9c6865e95192@xxxxxxx/
> Reported-by: Bart Van Assche <bvanassche@xxxxxxx>
> Reported-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
Reviewed-by: Nathan Chancellor <nathan@xxxxxxxxxx>
> ---
> include/linux/compiler.h | 5 +----
> lib/test_context-analysis.c | 11 +++++++++++
> 2 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index af16624b29fd..cb2f6050bdf7 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -149,10 +149,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> #endif
>
> #ifndef RELOC_HIDE
> -# define RELOC_HIDE(ptr, off) \
> - ({ unsigned long __ptr; \
> - __ptr = (unsigned long) (ptr); \
> - (typeof(ptr)) (__ptr + (off)); })
> +# define RELOC_HIDE(ptr, off) ((typeof(ptr))((unsigned long)(ptr) + (off)))
> #endif
>
> #define absolute_pointer(val) RELOC_HIDE((void *)(val), 0)
> diff --git a/lib/test_context-analysis.c b/lib/test_context-analysis.c
> index 140efa8a9763..06b4a6a028e0 100644
> --- a/lib/test_context-analysis.c
> +++ b/lib/test_context-analysis.c
> @@ -596,3 +596,14 @@ static void __used test_ww_mutex_lock_ctx(struct test_ww_mutex_data *d)
>
>  	ww_mutex_destroy(&d->mtx);
>  }
> +
> +static DEFINE_PER_CPU(raw_spinlock_t, test_per_cpu_lock);
> +
> +static void __used test_per_cpu(int cpu)
> +{
> +	raw_spin_lock(&per_cpu(test_per_cpu_lock, cpu));
> +	raw_spin_unlock(&per_cpu(test_per_cpu_lock, cpu));
> +
> +	raw_spin_lock(per_cpu_ptr(&test_per_cpu_lock, cpu));
> +	raw_spin_unlock(per_cpu_ptr(&test_per_cpu_lock, cpu));
> +}
> --
> 2.53.0.851.ga537e3e6e9-goog
>
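
For readers following along, here is a minimal user-space sketch (not kernel
code) of the two generic RELOC_HIDE() forms from the diff above, placed side by
side to show that they compute the same pointer. The demo_* names, the slot
array, and the RELOC_HIDE_OLD/RELOC_HIDE_NEW spellings are hypothetical and
exist only for illustration. The sketch assumes GCC or Clang (statement
expressions and __typeof__ are GNU C extensions) and an LP64 target where
'unsigned long' is pointer-sized. It says nothing about how Thread Safety
Analysis treats either form; it only illustrates the value equivalence.

```c
#include <assert.h>
#include <stddef.h>

/* Old generic form: statement expression with an intermediate variable. */
#define RELOC_HIDE_OLD(ptr, off)				\
	({ unsigned long __ptr;					\
	   __ptr = (unsigned long)(ptr);			\
	   (__typeof__(ptr))(__ptr + (off)); })

/* New generic form from the patch: a single expression. */
#define RELOC_HIDE_NEW(ptr, off) \
	((__typeof__(ptr))((unsigned long)(ptr) + (off)))

/* Hypothetical stand-in for per-CPU storage: one slot per "CPU". */
#define DEMO_NR_SLOTS 4
struct demo_slot { int lock; };
static struct demo_slot demo_slots[DEMO_NR_SLOTS];

/* Address of a slot computed through the old macro form. */
static struct demo_slot *demo_slot_old(size_t cpu)
{
	return RELOC_HIDE_OLD(&demo_slots[0], cpu * sizeof(struct demo_slot));
}

/* Address of a slot computed through the new macro form. */
static struct demo_slot *demo_slot_new(size_t cpu)
{
	return RELOC_HIDE_NEW(&demo_slots[0], cpu * sizeof(struct demo_slot));
}

/* Returns 1 if both forms yield the same pointer for every slot. */
static int demo_forms_agree(void)
{
	for (size_t cpu = 0; cpu < DEMO_NR_SLOTS; cpu++) {
		/* Offsets stay in bounds here, so this is the plain element. */
		assert(demo_slot_new(cpu) == &demo_slots[cpu]);
		if (demo_slot_old(cpu) != demo_slot_new(cpu))
			return 0;
	}
	return 1;
}
```

Both helpers go through the same 'unsigned long' round trip; the only
difference is whether the integer value is parked in a named local first.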