Re: [PATCH net-next v6 5/5] veth: time-based BQL completion coalescing via ethtool tx-usecs

From: Simon Schippers

Date: Mon Jun 01 2026 - 08:00:59 EST


On 5/28/26 09:46, Jonas Köppeler wrote:
> On 5/27/26 3:54 PM, hawk@xxxxxxxxxx wrote:
>> From: Simon Schippers <simon.schippers@xxxxxxxxxxxxxx>
>>
>> Per-packet BQL completion forces DQL to converge on limit=2, causing
>> excessive NAPI scheduling overhead and qdisc requeues.
>>
>> Accumulate BQL completions and flush them when a configurable time
>> threshold is exceeded, letting DQL discover a limit that bounds actual
>> queuing delay to the configured interval. Coalescing state persists
>> across NAPI polls in struct veth_rq so completions can accumulate
>> beyond a single budget=64 cycle.
>>
>> Add ethtool tx-usecs support for runtime tuning. Default is 100 us;
>> setting tx-usecs to 0 disables coalescing and falls back to per-packet
>> completion.
>>
>> ethtool -C <veth-dev> tx-usecs 500 # 500us coalescing
>> ethtool -C <veth-dev> tx-usecs 0 # per-packet (no coalescing)
>>
>> Co-developed-by: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>
>> Signed-off-by: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>
>> Signed-off-by: Simon Schippers <simon.schippers@xxxxxxxxxxxxxx>
>
> Tested-by: Jonas Köppeler <j.koeppeler@xxxxxxxxxxxx>
>

Thanks for your testing!

However, I have trouble reproducing your results.
I am running on bare metal (without virtme) with v6 plus your pktgen
patch, on the branch pktgen-and-benchmark at commit
"results: add veth-bql measurements":

1. ping fails with 100% packet loss about 20% of the time with --pktgen.
When this happens, the avg ping of that run is mistakenly set
to 0.0 ms, which distorts the results.
I worked around it locally by rerunning whenever this happens.

2. pktgen reports > 3 Mpps even with --nrules 10000, see log below.
I see that this is because of qdisc drops
(drops=6591520 vs. pkts=27417 in the log).
I also tried pfifo and sfq, with the same result.
I spent quite some time on this but have not found a fix.
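
For issue 1, the handling I have in mind could be sketched roughly as
below. This is only an illustration in Python, not code from
veth_bql_test.sh: the helper name avg_rtt_ms is hypothetical, and the
"packets transmitted" sample lines are made up to match the usual
iputils ping summary format (only the rtt line is taken from the log
below). The idea is to report "no measurement" on total loss instead
of 0.0 ms, so the caller can rerun or skip the sample.

```python
import re
from typing import Optional


def avg_rtt_ms(ping_output: str) -> Optional[float]:
    """Return the average RTT in ms, or None if the run is unusable.

    Hypothetical helper: a run with 100% loss (or without an rtt
    summary line) yields None rather than 0.0.
    """
    loss = re.search(r"(\d+(?:\.\d+)?)% packet loss", ping_output)
    rtt = re.search(r"rtt min/avg/max/mdev = [\d.]+/([\d.]+)/", ping_output)
    if loss is None or rtt is None or float(loss.group(1)) >= 100.0:
        return None
    return float(rtt.group(1))


# Sample inputs; the transmitted/received counts are invented.
ok_run = (
    "5 packets transmitted, 5 received, 0% packet loss, time 4005ms\n"
    "rtt min/avg/max/mdev = 0.127/1.818/2.703/0.761 ms\n"
)
lost_run = "10 packets transmitted, 0 received, 100% packet loss, time 9210ms\n"

print(avg_rtt_ms(ok_run))    # usable run
print(avg_rtt_ms(lost_run))  # total loss: caller should rerun or skip
```

With something like this, a failed run would show up as a missing
sample rather than pulling the average towards zero.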

Do you have an idea?
Thanks!


The raw log:

sudo ./veth_bql_test.sh --pktgen --duration 2 --qdisc fq_codel --no-bpftrace --tx-usecs 100 --nrules 10000
INFO: Setting up veth pair with GRO
INFO: Threaded NAPI enabled
INFO: Installing qdisc: fq_codel
INFO: Loaded 10000 iptables rules in consumer NS
INFO: kernel: 7.1.0-rc4-patched-20260307+
INFO: BQL sysfs found: /sys/class/net/veth_bql0/queues/tx-0/byte_queue_limits
INFO: ethtool tx-usecs set to 100 on veth_bql1 (rx side)
INFO: Starting ping to 10.99.0.2 (5/s) to measure latency under load
INFO: Starting pktgen queue_xmit on veth_bql0 (threads=1 pkt_size=64)
[5s] BQL inflight=1 limit=17 watchdog=0
[5s] qdisc fq_codel pkts=27417 drops=6591520 requeues=14115 backlog=0 qlen=0 overlimits=0
[5s] softnet: processed=27960 time_squeeze=0 multi-CPU(6): cpu0(+5) cpu1(+121) cpu2(+33) cpu3(+116) cpu4(+27641) cpu5(+44)
INFO: Ping loss: 0% packet loss
INFO: Ping summary: rtt min/avg/max/mdev = 0.127/1.818/2.703/0.761 ms
INFO: pktgen results (thread 0):
Params: count 0 min_pkt_size: 64 max_pkt_size: 64
frags: 0 delay: 0 clone_skb: 0 ifname: veth_bql0@0
flows: 0 flowlen: 0
queue_map_min: 0 queue_map_max: 0
dst_min: 10.99.0.2 dst_max:
src_min: src_max:
src_mac: 0e:aa:4e:05:95:89 dst_mac: c2:e5:a3:4c:2a:7f
udp_src_min: 9 udp_src_max: 9 udp_dst_min: 9999 udp_dst_max: 9999
src_mac_count: 0 dst_mac_count: 0
xmit_mode: xmit_queue
Flags: NO_TIMESTAMP QUEUE_MAP_CPU SHARED
Current:
pkts-sofar: 6516031 errors: 102854
started: 12246566004us stopped: 12248535234us idle: 0us
seq_num: 6516032 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
cur_saddr: 10.99.0.1 cur_daddr: 10.99.0.2
cur_udp_dst: 9999 cur_udp_src: 9
cur_queue_map: 0
flows: 0
Result: OK: 1969229(c1969229+d0) usec, 6516031 (64byte,0frags)
3308924pps 1694Mb/sec (1694169088bps) errors: 102854
TEST: veth_bql [ OK ]
INFO: Results: /home/simon/repos/veth-backpressure-performance-testing/results/selftests/2026-06-01T13-24-51
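
A quick sanity check of the numbers in this log, as a rough Python
sketch. All inputs are copied from the log above; the variable names
are mine. It assumes the qdisc counters cover the same window as the
pktgen run, which may not hold exactly since they are sampled at [5s],
so the delivered-rate figure is only an estimate.

```python
# Values copied from the raw log above.
pkts_sofar = 6516031        # pktgen "pkts-sofar"
duration_us = 1969229       # pktgen "Result: OK: 1969229(...) usec"
qdisc_pkts = 27417          # fq_codel "pkts"
qdisc_drops = 6591520       # fq_codel "drops"

# Rate as pktgen reports it: packets handed to the stack per second.
pktgen_pps = pkts_sofar * 1_000_000 // duration_us

# Share of packets seen by the qdisc that were dropped.
drop_share = qdisc_drops / (qdisc_pkts + qdisc_drops)

# Rough estimate of what actually left the qdisc per second, assuming
# the qdisc counters and the pktgen run cover the same time window.
delivered_pps_estimate = qdisc_pkts * 1_000_000 / duration_us

print(f"pktgen pps:            {pktgen_pps}")
print(f"qdisc drop share:      {drop_share:.2%}")
print(f"delivered pps (rough): {delivered_pps_estimate:.0f}")
```

The first value matches the 3308924pps printed by pktgen, so that
figure appears to describe the offered rate rather than the rate that
passes through the 10000 iptables rules.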