Re: [PATCH] x86/asm: Switch clflush alternatives to use %a address operand modifier
From: David Laight
Date: Thu Mar 19 2026 - 07:22:15 EST
On Thu, 19 Mar 2026 11:45:59 +0100
Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> On Thu, Mar 19, 2026 at 11:20 AM David Laight
> <david.laight.linux@xxxxxxxxx> wrote:
> >
> > On Wed, 18 Mar 2026 16:45:28 +0100
> > Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> >
> > > On Wed, Mar 18, 2026 at 4:03 PM David Laight
> > > <david.laight.linux@xxxxxxxxx> wrote:
> > > >
> > > > On Wed, 18 Mar 2026 10:08:11 +0100
> > > > Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> > > >
> > > > > The inline asm used with alternative_input() specifies the address
> > > > > operand for clflush with the "a" input operand constraint and
> > > > > explicit "(%[addr])" dereference:
> > > > >
> > > > > "clflush (%[addr])", [addr] "a" (addr)
> > > > >
> > > > > This forces the pointer into %rax and manually encodes the memory
> > > > > operand in the template. Instead, use the %a address operand
> > > > > modifier and relax the constraint from "a" to "r":
> > > > >
> > > > > "clflush %a[addr]", [addr] "r" (addr)
> > > > >
> > > > > This lets the compiler choose the register while generating the
> > > > > correct addressing mode.
> > > >
> > > > Aren't these two independent changes?
> > >
> > > I was hoping I can put a trivial "a" -> "r" change under the "also
> > > ..." change. OTOH, let's change the summary to "x86/asm: Improve
> > > clflush alternatives assembly", that will also handle your proposed
> > > addition of "memory" clobber.
> > >
> > > > %a saves you having to know how to write the memory reference for the
> > > > architecture - so is the same as (%[addr]) (assuming att syntax).
> > > > I think the assembler handles the one 'odd' case of (%rbp).
> > >
> > > Yes, it does, and also fixes another 'odd' case of (%r13).
> > >
> > > > Was there ever a reason for using "a" rather than "r" - it seems an
> > > > unusual choice.
> > >
> > > Probably just an oversight due to a follow-up __monitor() that wants
> > > its operand in %rax.
> >
> > Actually gcc can be quite bad at reverse tracking register requirements.
>
> This must be a very old GCC as I'm not aware of this deficiency.
>
> --cut here--
> void foo (int a)
> {
> asm volatile ("# 1" : : "r" (a));
> asm volatile ("# 2" : : "a" (a));
> }
>
> void bar (int a)
> {
> asm volatile ("# 1" : : "a" (a));
> asm volatile ("# 2" : : "a" (a));
> }
> --cut here--
>
> foo:
> movl %edi, %eax
> # 1
> # 2
> ret
>
> bar:
> movl %edi, %eax
> # 1
> # 2
> ret
>
> Do you perhaps have a testcase to illustrate your claim?

If you look at enough gcc output you'll see places where there are register
moves that look like they could be removed by adjusting the register
assignments.
I'm pretty sure Linus has commented about that as well.
Whether it can happen in this trivial case is another matter.

Oh - I can't see anything in the gcc 15.2 doc that says that the order
of 'asm volatile' statements can't get swapped.
I'm also pretty sure that some older (possibly very much older) versions
definitely would swap them over.
There might have been a post from someone saying that 'it doesn't do that
any more', but it isn't documented.
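
If the relative order of two asm statements matters, one way to avoid
leaning on undocumented behaviour is to give the compiler an explicit
dependency. A minimal sketch follows; it is not from the patch, the
function names are made up for illustration, and the templates are empty
so that only the constraints and clobbers matter. It shows two options:
a "memory" clobber, and a value threaded through both statements as an
in/out operand.

```c
/*
 * Hypothetical illustration only: store_then_barrier() and ordered_pair()
 * are invented names, and the empty asm templates emit no instructions.
 * __asm__ __volatile__ is used so this also builds with strict -std=c11.
 */

/*
 * A "memory" clobber tells the compiler the statement may read or write
 * any memory, so the store before it has to be emitted first and the
 * load after it cannot reuse a value cached from before the statement.
 */
static inline int store_then_barrier(int *p, int val)
{
	*p = val;
	__asm__ __volatile__("" : : "r" (p) : "memory");
	return *p;
}

/*
 * Threading 'v' through both statements as an in/out ("+r") operand gives
 * the second statement a data dependency on the output of the first, so
 * the compiler cannot schedule it ahead of the first one.
 */
static inline int ordered_pair(int v)
{
	__asm__ __volatile__("" : "+r" (v));	/* first statement */
	__asm__ __volatile__("" : "+r" (v));	/* second, consumes the first's output */
	return v;
}
```

With the "+r" chain, swapping the two statements would change the data
flow, so the order no longer depends on how 'volatile' is treated.
The clobber variant is the kind of dependency that adding "memory" to the
clflush asm would give.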
David
>
> > So forcing 'addr' into %rax for the clflush might actually remove
> > a register move before the monitor.
> > Indeed, were it to pick a different register there will always be an
> > extra register move.
> > If the value is in a different register (eg from a function call)
> > then you'll move the register move instruction - but there'll still
> > be one.
> >
> > So I suspect this change can never improve the code.
>
> Of course, there will always be a register move in the above case, but
> please look at [1].
>
> [1] https://claude.ai/share/cf559f66-dfcf-451a-8260-6f687aead052
>
> Uros.