Re: [PATCH v10 00/12] barrier: Add smp_cond_load_{relaxed,acquire}_timeout()
From: Catalin Marinas
Date: Wed Mar 25 2026 - 11:56:53 EST
On Mon, Mar 16, 2026 at 11:37:12PM +0000, David Laight wrote:
> On Mon, 16 Mar 2026 15:08:07 -0700
> Ankur Arora <ankur.a.arora@xxxxxxxxxx> wrote:
> > However, as David Laight pointed out in this thread
> > (https://lore.kernel.org/lkml/20260214113122.70627a8b@pumpkin/),
> > this would be fine so long as the polling is on memory, but would
> > need some work to handle MMIO.
>
> I'm not sure the current code works with MMIO on arm64.
It won't, but passing an MMIO pointer to smp_cond_load() is wrong in
general anyway. You'd need a new API that takes an __iomem pointer.
> I was looking at the osq_lock() code, it uses smp_cond_load() with 'expr'
> being 'VAL || need_resched()' expecting to get woken by the IPI associated
> with the preemption being requested.
> But the arm64 code relies on 'wfe' being woken when the memory write
> 'breaks' the 'ldx' for the monitored location.
> That will only work for cached addresses.
Even worse, depending on the hardware, you may even get a data abort
when attempting LDXR on Device memory.
> For osq_lock(), while an IPI will wake it up, there is also a small timing
> window where the IPI can happen before the ldx and so not actually wake it up.
> This is true whenever 'expr' is non-trivial.
Hmm, I thought this was fine because of the implicit SEVL on exception
return, but since the arm64 __cmpwait_relaxed() does a SEVL+WFE, which
clears any prior event, it can in theory wait forever when the event
stream is disabled.
Expanding smp_cond_load_relaxed() into asm, we have something like:
	LDR	X0, [PTR]
	<check VAL || need_resched(); branch to out if true>
	SEVL
	WFE
	LDXR	X1, [PTR]
	EOR	X1, X1, X0
	CBNZ	X1, out
	WFE
out:
If the condition is updated to become true (need_resched()) after the
condition check but before the first WFE while *PTR remains unchanged,
the IPI won't do anything. Maybe we should revert 1cfc63b5ae60 ("arm64:
cmpwait: Clear event register before arming exclusive monitor"). Not
great but probably better than reverting f5bfdc8e3947 ("locking/osq: Use
optimized spinning loop for arm64").
Using SEV instead of IPI would have the same problem.
--
Catalin