Re: [PATCH] x86/asm: Switch clflush alternatives to use %a address operand modifier
From: Uros Bizjak
Date: Thu Mar 19 2026 - 08:28:53 EST
On Thu, Mar 19, 2026 at 12:21 PM David Laight
<david.laight.linux@xxxxxxxxx> wrote:
>
> On Thu, 19 Mar 2026 11:45:59 +0100
> Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
>
> > On Thu, Mar 19, 2026 at 11:20 AM David Laight
> > <david.laight.linux@xxxxxxxxx> wrote:
> > >
> > > On Wed, 18 Mar 2026 16:45:28 +0100
> > > Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> > >
> > > > On Wed, Mar 18, 2026 at 4:03 PM David Laight
> > > > <david.laight.linux@xxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, 18 Mar 2026 10:08:11 +0100
> > > > > Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> > > > >
> > > > > > The inline asm used with alternative_input() specifies the address
> > > > > > operand for clflush with the "a" input operand constraint and
> > > > > > explicit "(%[addr])" dereference:
> > > > > >
> > > > > > "clflush (%[addr])", [addr] "a" (addr)
> > > > > >
> > > > > > This forces the pointer into %rax and manually encodes the memory
> > > > > > operand in the template. Instead, use the %a address operand
> > > > > > modifier and relax the constraint from "a" to "r":
> > > > > >
> > > > > > "clflush %a[addr]", [addr] "r" (addr)
> > > > > >
> > > > > > This lets the compiler choose the register while generating the
> > > > > > correct addressing mode.
> > > > >
> > > > > Aren't these two independent changes?
> > > >
> > > > I was hoping I could put a trivial "a" -> "r" change under the "also
> > > > ..." change. OTOH, let's change the summary to "x86/asm: Improve
> > > > clflush alternatives assembly", that will also handle your proposed
> > > > addition of "memory" clobber.
> > > >
> > > > > %a saves you having to know how to write the memory reference for the
> > > > > architecture - so is the same as (%[addr]) (assuming att syntax).
> > > > > I think the assembler handles the one 'odd' case of (%rbp).
> > > >
> > > > Yes, it does, and also fixes another 'odd' case of (%r13).
> > > >
> > > > > Was there ever a reason for using "a" rather than "r" - it seems an
> > > > > unusual choice.
> > > >
> > > > Probably just an oversight due to a follow-up __monitor() that wants
> > > > its operand in %rax.
> > >
> > > Actually gcc can be quite bad at reverse tracking register requirements.
> >
> > This must be a very old GCC as I'm not aware of this deficiency.
> >
> > --cut here--
> > void foo (int a)
> > {
> >   asm volatile ("# 1" : : "r" (a));
> >   asm volatile ("# 2" : : "a" (a));
> > }
> >
> > void bar (int a)
> > {
> >   asm volatile ("# 1" : : "a" (a));
> >   asm volatile ("# 2" : : "a" (a));
> > }
> > --cut here--
> >
> > foo:
> >         movl %edi, %eax
> >         # 1
> >         # 2
> >         ret
> >
> > bar:
> >         movl %edi, %eax
> >         # 1
> >         # 2
> >         ret
> >
> > Do you perhaps have a testcase to illustrate your claim?
>
> If you look at enough gcc output you'll see places where there are register
> moves that look like they could be removed by adjusting the register
> assignments.
> I'm pretty sure Linus has commented about that as well.
> Whether it can happen in this trivial case is another matter.
>
> Oh - I can't see anything in the gcc 15.2 doc that says that the order
> of 'asm volatile' statements can't get swapped.
> I'm also pretty sure that some older (possibly very much older) versions
> definitely would swap them over.
> There might have been a post from someone saying that 'it doesn't do that
> any more', but it isn't documented.
It isn't explicitly documented. But rest assured that volatile asm
statements won't be scheduled across each other.

From gcc/sched-deps.cc:

    Traditional and volatile asm instructions must be considered to use
    and clobber all hard registers, all pseudo-registers and all of
    memory. So must TRAP_IF and UNSPEC_VOLATILE operations.
    ...
    reg_pending_barrier = TRUE_BARRIER;
Uros.