Re: [PATCH v2] iopoll: use udelay() for initial polling

From: Jani Nikula

Date: Thu May 21 2026 - 03:03:46 EST


On Wed, 20 May 2026, Peter Collingbourne <peter@xxxxxxxxx> wrote:
> On Wed, May 20, 2026 at 6:29 AM Ville Syrjälä
> <ville.syrjala@xxxxxxxxxxxxxxx> wrote:
>>
>> On Tue, May 19, 2026 at 03:24:46AM -0700, Peter Collingbourne wrote:
>> > A short polling delay, such as the delay of 5us
>> > (SPINAND_READ_POLL_DELAY_US) provided by the SPI NAND driver,
>> > can become a 1/HZ (order of ms) delay caused by the usleep_range()
>> > call in read_poll_timeout(), significantly reducing SPI NAND access
>> > performance. Fix it by adjusting the read_poll_timeout() macro to use
>> > udelay() to delay until 1/10 of a timer tick after it is called, and
>> > only then sleep.

What's "1/10 of a timer tick" based on?

fsleep() has simply 10 us as the threshold.

>> >
>> > Fixes: c955a0cc8a28 ("spi: spi-mem: add automatic poll status functions")
>> > Signed-off-by: Peter Collingbourne <peter@xxxxxxxxx>
>> > ---
>> > include/linux/iopoll.h | 30 ++++++++++++++++++++++--------
>> > 1 file changed, 22 insertions(+), 8 deletions(-)
>> >
>> > v2:
>> > * Fix it in read_poll_timeout() instead
>> >
>> > diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
>> > index 53edd69acb9b..2ee89b76f072 100644
>> > --- a/include/linux/iopoll.h
>> > +++ b/include/linux/iopoll.h
>> > @@ -19,9 +19,11 @@
>> > *
>> > * @op: Operation
>> > * @cond: Break condition
>> > - * @sleep_us: Maximum time to sleep between operations in us (0 tight-loops).
>> > - * Please read usleep_range() function description for details and
>> > - * limitations.
>> > + * @sleep_us: Maximum time to sleep or delay between operations in us
>> > + * (0 tight-loops). Please read usleep_range() and udelay()
>> > + * function descriptions for details and limitations.
>> > + * This macro will delay until 1/10 of a timer tick after
>> > + * it is called, and will then start sleeping.
>> > * @timeout_us: Timeout in us, 0 means never timeout
>> > * @sleep_before_op: if it is true, sleep @sleep_us before operation.
>> > *
>> > @@ -35,11 +37,18 @@
>> > ({ \
>> > u64 __timeout_us = (timeout_us); \
>> > unsigned long __sleep_us = (sleep_us); \
>> > - ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
>> > + ktime_t __start_time = ktime_get(); \
>> > + u64 __delay_timeout_us = 100000/HZ; \
>> > + ktime_t __delay_timeout = ktime_add_us(__start_time, __delay_timeout_us); \
>> > + ktime_t __timeout = ktime_add_us(__start_time, __timeout_us); \
>> > int ___ret; \
>> > might_sleep_if((__sleep_us) != 0); \
>> > - if ((sleep_before_op) && __sleep_us) \
>> > - usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
>> > + if ((sleep_before_op) && __sleep_us) { \
>> > + if (__sleep_us <= __delay_timeout_us) \
>> > + udelay(__sleep_us); \
>>
>> If you want udelay() why not just use the atomic variant of the macro?
>
> That's what I had in v1; we decided this approach would better handle
> misbehaving devices.
> https://lore.kernel.org/all/20260517150253.031dec09@pumpkin/

I think the problem with trying to adapt to everything within
read_poll_timeout() is that every step like this adapts to a *specific*
use case, and once it gets specific enough, it's no longer usable to
other scenarios.

Having to reimplement the whole thing in drivers is much worse than
having to do two calls.

Could a staggered approach work?

ret = read_poll_timeout_atomic("short delay/timeout")
if (ret)
ret = read_poll_timeout("longer delay/timeout")

Then you have better control of the behaviour in the driver, instead of
adapting a generic function to a specific use case.

BR,
Jani.

--
Jani Nikula, Intel