Re: [RFC PATCH net-next 1/2] net: napi: Fix interrupts permanently disabled during busy poll

From: Jakub Kicinski

Date: Tue Apr 28 2026 - 19:49:33 EST


On Tue, 28 Apr 2026 17:51:30 +0000 Dragos Tatulea wrote:
> Under certain conditions a queue can be left out with interrupts
> disabled and with the napi re-scheduling timer permanently stopped.
> This behaviour is triggered by the napi busy poll path when
> gro-flush-timeout and defer-hard-irq are set. Here's a sequence of
> operations:
>
> 1. Busy poll starts, NAPI_STATE_SCHED is set to avoid rescheduling napi
> from the timer.
>
> 2. During napi poll, driver disables interrupts due to being in poll
> mode (napi_complete_done() returns false because napi->state has
> NAPIF_STATE_IN_BUSY_POLL set).

Why does the driver have IRQs disabled in busy poll?

> 3. At the end of the busy poll (busy_poll_stop()):
> 3.1 napi timer is scheduled and skip_schedule is set (due to config)
> 3.2 napi->poll() is called:
> - driver poll() processes exactly budget packets
> and exits early => napi not scheduled.
> (interrupts are still disabled at this point)
> 3.3 Since napi poll processed budget packets, __busy_poll_stop()
> is called with skip_schedule set => napi is not scheduled here
> either.

With skip_schedule set, __busy_poll_stop() calls:

	clear_bit(NAPI_STATE_SCHED, &napi->state);

> 4. If the napi timer from 3.1 gets to be triggered due to slow napi poll
> or some other reason, the timer will run with no effect (due to
> NAPI_STATE_SCHED being set).

And here you claim NAPI_STATE_SCHED is still set?
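
To make the question concrete, here is a rough userspace sketch of how the SCHED bit moves through steps 1, 3.3 and 4. It is a toy model, not kernel code: struct model_napi and the model_*() helpers are made-up names, only the one bit is tracked, and the timer is assumed to do nothing when it finds the bit already set (as step 4 describes).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, not kernel code. Only the SCHED bit is tracked. */
struct model_napi {
	bool sched;	/* stands in for NAPI_STATE_SCHED */
};

/* Step 1: busy poll takes ownership of the NAPI instance. */
static void model_busy_poll_start(struct model_napi *n)
{
	n->sched = true;
}

/* Step 3.3: models only the skip_schedule branch discussed above. */
static void model_busy_poll_stop(struct model_napi *n, bool skip_schedule)
{
	assert(n->sched);	/* busy poll must still own the instance */
	if (skip_schedule)
		n->sched = false;	/* clear_bit(NAPI_STATE_SCHED, ...) */
	/* Otherwise NAPI is rescheduled and the bit stays set. */
}

/*
 * Step 4: the timer has an effect only if it finds the bit clear.
 * Returns true if it (re)scheduled NAPI, false if it ran with no effect.
 */
static bool model_timer_fires(struct model_napi *n)
{
	if (n->sched)
		return false;
	n->sched = true;
	return true;
}
```

In this model, a timer that fires after model_busy_poll_stop(&n, true) finds the bit clear and reschedules. It only runs with no effect if it fires before the clear, i.e. while the poll in 3.2 is still in progress. Which of the two orderings step 4 has in mind is what needs spelling out.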