Re: [PATCH net-next v6 5/5] veth: time-based BQL completion coalescing via ethtool tx-usecs

From: Simon Schippers

Date: Tue Jun 02 2026 - 11:58:32 EST


On 6/2/26 09:24, Jonas Köppeler wrote:
> On 6/1/26 6:16 PM, Simon Schippers wrote:
>
>> On 6/1/26 16:03, Jonas Köppeler wrote:
>>> On 6/1/26 2:00 PM, Simon Schippers wrote:
>>>> On 5/28/26 09:46, Jonas Köppeler wrote:
>>>>> On 5/27/26 3:54 PM, hawk@xxxxxxxxxx wrote:
>>>>>> From: Simon Schippers <simon.schippers@xxxxxxxxxxxxxx>
>>>>>>
>>>>>> Per-packet BQL completion forces DQL to converge on limit=2, causing
>>>>>> excessive NAPI scheduling overhead and qdisc requeues.
>>>>>>
>>>>>> Accumulate BQL completions and flush them when a configurable time
>>>>>> threshold is exceeded, letting DQL discover a limit that bounds actual
>>>>>> queuing delay to the configured interval. Coalescing state persists
>>>>>> across NAPI polls in struct veth_rq so completions can accumulate
>>>>>> beyond a single budget=64 cycle.
>>>>>>
>>>>>> Add ethtool tx-usecs support for runtime tuning. Default is 100 us;
>>>>>> setting tx-usecs to 0 disables coalescing and falls back to per-packet
>>>>>> completion.
>>>>>>
>>>>>> ethtool -C <veth-dev> tx-usecs 500 # 500us coalescing
>>>>>> ethtool -C <veth-dev> tx-usecs 0 # per-packet (no coalescing)
>>>>>>
>>>>>> Co-developed-by: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>
>>>>>> Signed-off-by: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>
>>>>>> Signed-off-by: Simon Schippers <simon.schippers@xxxxxxxxxxxxxx>
>>>>> Tested-by: Jonas Köppeler <j.koeppeler@xxxxxxxxxxxx>
>>>>>
>>>> Thanks for your testing!
>>>>
>>>> However, I have issues reproducing.
>>>> I run bare metal (without virtme) with v6 + your pktgen patch
>>>> and I am on the branch pktgen-and-benchmark, commit
>>>> "results: add veth-bql measurements":
>>>>
>>>> 1. ping fails with 100% packet loss ~20% of the time with --pktgen.
>>>> When this happens the avg ping of this run is mistakenly set
>>>> to 0.0 ms, which distorts the results.
>>>> I fixed it locally by rerunning when this happens.
>>>>
>>>> 2. pktgen runs with > 3 Mpps even with --nrules 10000, see log below.
>>>> I see that this is because of qdisc drops.
>>>> I also tried pfifo and sfq but with the same result.
>>>> I spent quite some time on it but I do not know a fix.
>>>>
>>>> Do you have an idea?
>>>> Thanks!
>>> Hi,
>>> yes there are some changes missing in the test script.
>>> I have pushed it now, sorry. This should fix 1.
>> I pulled it and ran...
>>
>> sudo ./veth_bql_sweep.sh --runs 1 --pktgen --duration 20 --qdisc fq_codel --no-bpftrace
>>
>> ... but still 8/32=1/4 of the pings are zero, I do not see
>> a pattern.
>>
>>
>> I grabbed the logs from /tmp and this is what a failing
>> ping looks like:
>>
>> PING 10.99.0.2 (10.99.0.2) 56(84) bytes of data.
>>
>> --- 10.99.0.2 ping statistics ---
>> 97 packets transmitted, 0 received, 100% packet loss, time 19967ms
>>
>>
>> Feels like a race or something.
>> Can you reproduce with the exact command?
>> I think you need --runs 1, else it just averages over multiple
>> runs.
>
> Sorry, no I could not reproduce this. I used the exact same
> command as you did, and I am using net-next/main + v6 patches.
> I have 0% ping loss across all tests. Does the ping loss
> happen regardless of the qdisc?
>

Yes, it happens with every qdisc I tested.
As a workaround I changed the script to rerun whenever this happens.
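
A minimal Python sketch of that rerun logic, assuming the summary
format shown in the failing log above. This is not the actual change
to the benchmark script; parse_ping_avg_ms, measure_with_retry and
run_once are hypothetical names used only for illustration.

```python
import re


def parse_ping_avg_ms(output):
    """Return the average RTT in ms, or None if no reply was received."""
    m = re.search(r"(\d+) packets transmitted, (\d+) received", output)
    if m is None or int(m.group(2)) == 0:
        # 100% loss: report "no result" instead of a misleading 0.0 ms.
        return None
    # iputils ping summary: "rtt min/avg/max/mdev = a/b/c/d ms"
    rtt = re.search(r"= [\d.]+/([\d.]+)/", output)
    return float(rtt.group(1)) if rtt else None


def measure_with_retry(run_once, max_tries=3):
    """Call run_once() (assumed to return raw ping output) until a run
    yields a usable average, giving up after max_tries attempts."""
    for _ in range(max_tries):
        avg = parse_ping_avg_ms(run_once())
        if avg is not None:
            return avg
    return None
```

Returning None rather than 0.0 keeps a failed run from being averaged
into the results.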

With that change I ran the benchmark and also wrote a script that
prints the results as an ASCII table.
I think it would make sense to include something like this in
the commit message.

Throughput (pps)
==================================================
nrules |   0us | 100us | 1000us | 10000us || stock
-------+-------+-------+--------+---------++------
     0 | 1.65M | 1.75M |  1.74M |   1.74M || 1.73M
   100 |  684K |  755K |   730K |    728K ||  744K
  1000 |  119K |  126K |   126K |    125K ||  126K
 10000 |   13K |   12K |    13K |     13K ||   13K


Ping RTT ms (avg)
==================================================
nrules |   0us | 100us | 1000us | 10000us || stock
-------+-------+-------+--------+---------++------
     0 | 0.016 | 0.138 |  0.137 |   0.135 || 0.133
   100 | 0.029 | 0.185 |  0.310 |   0.315 || 0.310
  1000 | 0.137 | 0.321 |   1.66 |    1.81 ||  1.78
 10000 |  1.22 |  1.87 |   3.02 |    16.0 ||  17.2
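
A table in this layout could be rendered along the following lines.
This is only a sketch, not the script mentioned above; render_table
and its arguments are hypothetical, and the cells are assumed to be
pre-formatted strings.

```python
def render_table(title, columns, rows, baseline="stock"):
    """Render rows as a right-aligned ASCII table.

    columns: names of the tx-usecs columns, e.g. ["0us", "100us"]
    rows:    one list per nrules value: [nrules, *cells, baseline_cell]
    The baseline column is set apart with a double bar.
    """
    headers = ["nrules"] + list(columns) + [baseline]
    body = [[str(c) for c in row] for row in rows]
    # Each column is as wide as its widest cell (header included).
    widths = [max([len(h)] + [len(r[i]) for r in body])
              for i, h in enumerate(headers)]

    def fmt(cells):
        left = " | ".join(c.rjust(w) for c, w in zip(cells[:-1], widths[:-1]))
        return left + " || " + cells[-1].rjust(widths[-1])

    rule = "-+-".join("-" * w for w in widths[:-1]) + "-++-" + "-" * widths[-1]
    lines = [title, "=" * len(rule), fmt(headers), rule]
    lines += [fmt(r) for r in body]
    return "\n".join(lines)


print(render_table(
    "Throughput (pps)",
    ["0us", "100us", "1000us", "10000us"],
    [["0", "1.65M", "1.75M", "1.74M", "1.74M", "1.73M"],
     ["10000", "13K", "12K", "13K", "13K", "13K"]],
))
```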

>>
>>> Regarding 2.: do not look at the pktgen output, in the
>>> new version you will see something like "goodput",
>>> which is the number you should look for.
>>> Pktgen will report at what speed it enqueued packets in
>>> the qdisc.
>> Exactly. Now it works. Had a single outlier but apart from that
>> everything is fine.
>>
>> Thanks,
>> Simon
>>
>>> Let me know if it worked.
>>> Best,
>>> Jonas
>>>