Re: [PATCH] net_sched: codel: fix stale state for empty flows in fq_codel

From: Eric Dumazet

Date: Mon Mar 23 2026 - 14:49:10 EST


On Mon, Mar 23, 2026 at 10:49 AM <hawk@xxxxxxxxxx> wrote:
>
> From: Jonas Köppeler <j.koeppeler@xxxxxxxxxxxx>
>
> When codel_dequeue() finds an empty queue, it resets vars->dropping
> but does not reset vars->first_above_time. The reference CoDel
> algorithm (Nichols & Jacobson, ACM Queue 2012) resets both:
>
> dodeque_result codel_queue_t::dodeque(time_t now) {
> ...
> if (r.p == NULL) {
> first_above_time = 0; // <-- Linux omits this
> }
> ...
> }
>
> Note that codel_should_drop() does reset first_above_time when called
> with a NULL skb, but codel_dequeue() returns early before ever calling
> codel_should_drop() in the empty-queue case. The post-drop code paths
> do reach codel_should_drop(NULL) and correctly reset the timer, so a
> dropped packet breaks the cycle -- but the next delivered packet
> re-arms first_above_time and the cycle repeats.
>
> For sparse flows such as ICMP ping (one packet every 200ms-1s), the
> first packet arms first_above_time, the flow goes empty, and the
> second packet arrives after the interval has elapsed and gets dropped.
> The pattern repeats, producing sustained loss on flows that are not
> actually congested.
>
> Test: veth pair, fq_codel, BQL disabled, 30000 iptables rules in the
> consumer namespace (NAPI-64 cycle ~14ms, well above fq_codel's 5ms
> target), ping at 5 pps under UDP flood:
>
> Before fix: 26% ping packet loss
> After fix: 0% ping packet loss
>
> Fix by resetting first_above_time to zero in the empty-queue path
> of codel_dequeue(), matching the reference algorithm.
>
> Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM")
> Fixes: d068ca2ae2e6 ("codel: split into multiple files")
> Co-developed-by: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>
> Signed-off-by: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>
> Signed-off-by: Jonas Köppeler <j.koeppeler@xxxxxxxxxxxx>
> Reported-by: Chris Arges <carges@xxxxxxxxxxxxxx>
> Tested-by: Jonas Köppeler <j.koeppeler@xxxxxxxxxxxx>
> Reviewed-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
> Link: https://lore.kernel.org/all/20260318134826.1281205-7-hawk@xxxxxxxxxx/
> ---
> include/net/codel_impl.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/include/net/codel_impl.h b/include/net/codel_impl.h
> index 78a27ac73070..b2c359c6dd1b 100644
> --- a/include/net/codel_impl.h
> +++ b/include/net/codel_impl.h
> @@ -158,6 +158,7 @@ static struct sk_buff *codel_dequeue(void *ctx,
> bool drop;
>
> if (!skb) {
> + vars->first_above_time = 0;
> vars->dropping = false;
> return skb;
> }

Nice catch, thanks.

I guess I wanted to avoid the ktime_get() cost as much as I could....

Reviewed-by: Eric Dumazet <edumazet@xxxxxxxxxx>