Re: [PATCH] virtio-net: disable delayed refill when pausing rx

From: Bui Quang Minh
Date: Wed Apr 09 2025 - 02:56:09 EST


On 4/8/25 21:37, Jakub Kicinski wrote:
> On Tue, 8 Apr 2025 11:28:54 +0200 Paolo Abeni wrote:
>>>> When napi_disable is called on an already disabled napi, it will sleep
>>>> in napi_disable_locked while still holding the netdev_lock. As a result,
>>>> later napi_enable gets stuck too as it cannot acquire the netdev_lock.
>>>> This leads to refill_work and the pause-then-resume tx path being stuck
>>>> altogether.
>>> This needs to be added to the changelog. And it looks like this is a fix for
>>>
>>> commit 413f0271f3966e0c73d4937963f19335af19e628
>>> Author: Jakub Kicinski <kuba@xxxxxxxxxx>
>>> Date:   Tue Jan 14 19:53:14 2025 -0800
>>>
>>>     net: protect NAPI enablement with netdev_lock()
>>>
>>> ?
>>>
>>> I wonder if it's simpler to just hold the netdev lock in resize or xsk
>>> binding instead of this.
>> Setting:
>>
>> 	dev->request_ops_lock = true;
>>
>> in virtnet_probe() before calling register_netdevice() should achieve
>> the above. Could you please have a try?
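
If I understand the suggestion correctly, that would be a one-line change in
virtnet_probe(), roughly like the sketch below (untested, surrounding code
elided):

	static int virtnet_probe(struct virtio_device *vdev)
	{
		struct net_device *dev;
		...
		/* ask the core to take the per-device ops lock (netdev_lock)
		 * around queue management operations for this device
		 */
		dev->request_ops_lock = true;

		err = register_netdevice(dev);
		...
	}

so the resize / xsk (un)bind paths would then run with netdev_lock held, as
described above, if I read the 6.15 code correctly.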
> Can we do that or do we need a more tailored fix? request_ops_lock only
> appeared in 6.15 and the bug AFAIU dates back to 6.14. We don't normally
> worry about that, but given the stream of fixes - request_ops_lock is a bit
> risky.
> Jason's suggestion AFAIU is just to wrap the disable/enable pairs in
> the lock. We can try request_ops_lock in -next ?
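
If I understand Jason's suggestion correctly, that means holding the netdev
lock across the whole pause/(un)bind/resume sequence, so refill_work's
napi_disable() just waits on the lock until the napi is enabled again instead
of deadlocking. A rough, untested sketch for the xsk path (the parameter list
of virtnet_rq_bind_xsk_pool() and the *_locked variants are my reading of the
current code, the resize path would be similar):

	static int virtnet_rq_bind_xsk_pool(struct virtnet_info *vi,
					    struct receive_queue *rq,
					    struct xsk_buff_pool *pool)
	{
		...
		netdev_lock(vi->dev);
		/* virtnet_rx_pause() would have to use napi_disable_locked() */
		virtnet_rx_pause(vi, rq);

		/* actual bind/unbind of the pool happens here */

		/* ... and virtnet_rx_resume() napi_enable_locked() */
		virtnet_rx_resume(vi, rq);
		netdev_unlock(vi->dev);
		...
	}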

> Bui Quang Minh, could you add a selftest for this problem?
> See tools/testing/selftests/drivers/net/virtio_net/
> You can re-use / extend the XSK helper from
> tools/testing/selftests/drivers/net/xdp_helper.c ?
> (move it to tools/testing/selftests/net/lib for easier access)

I've just tried the current selftests and found that the existing queues.py test can trigger the deadlock.

Running the command below a few times (the reproduction probability is quite high) on a guest with a virtio-net interface:
~ NETIF=enp0s2 ./ksft-net-drv/run_kselftest.sh -t drivers/net:queues.py

The test hangs.

I got a hung task warning with the following locks held:

[  363.841549]  #0: ffff88800015c758 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x1a9/0x3c9
[  363.841560]  #1: ffffc90000043e28 ((work_completion)(&pool->work)){+.+.}-{0:0}, at: process_scheduled_works+0x1c2/0x3c9
[  363.841570]  #2: ffffffff81f5a150 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock+0x1b/0x1d
[  363.841579]  #3: ffff8880035e49f0 (&dev->lock){+.+.}-{4:4}, at: netdev_lock+0x12/0x14
[  363.841587] 3 locks held by kworker/2:0/26:
[  363.841589]  #0: ffff88800015c758 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x1a9/0x3c9
[  363.841598]  #1: ffffc900000f7e28 ((work_completion)(&(&vi->refill)->work)){+.+.}-{0:0}, at: process_scheduled_works+0x1c2/0x3c9
[  363.841608]  #2: ffff8880035e49f0 (&dev->lock){+.+.}-{4:4}, at: netdev_lock+0x12/0x14

~ cat /proc/7/stack
[<0>] netdev_lock+0x12/0x14
[<0>] napi_enable+0x1a/0x35
[<0>] virtnet_napi_do_enable+0x1a/0x40
[<0>] virtnet_napi_enable+0x32/0x4a
[<0>] virtnet_rx_resume+0x4f/0x56
[<0>] virtnet_rq_bind_xsk_pool+0xd8/0xfd
[<0>] virtnet_xdp+0x6d3/0x72d
[<0>] xp_disable_drv_zc+0x5a/0x63
[<0>] xp_clear_dev+0x1d/0x4a
[<0>] xp_release_deferred+0x24/0x75
[<0>] process_scheduled_works+0x23e/0x3c9
[<0>] worker_thread+0x13f/0x1ca
[<0>] kthread+0x1b2/0x1ba
[<0>] ret_from_fork+0x2b/0x40
[<0>] ret_from_fork_asm+0x11/0x20

~ cat /proc/26/stack
[<0>] napi_disable_locked+0x41/0xaf
[<0>] napi_disable+0x22/0x35
[<0>] refill_work+0x4b/0x8f
[<0>] process_scheduled_works+0x23e/0x3c9
[<0>] worker_thread+0x13f/0x1ca
[<0>] kthread+0x1b2/0x1ba
[<0>] ret_from_fork+0x2b/0x40
[<0>] ret_from_fork_asm+0x11/0x20

The deadlock is caused by a race between xsk unbind and refill_work: the xsk
unbind path (pid 7 above) pauses rx and disables the napi, then refill_work
(kworker 26) runs, calls napi_disable() on the already disabled napi and
sleeps in napi_disable_locked() while still holding the netdev_lock. The
unbind path's virtnet_rx_resume() -> napi_enable() can then never acquire the
netdev_lock, so both sides are stuck.
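
For context, the idea of this patch ($subject) is to keep the delayed refill
work from running while rx is paused. Very roughly, reusing the existing
disable_delayed_refill()/enable_delayed_refill() helpers (a sketch only, not
the actual diff):

	static void virtnet_rx_pause(struct virtnet_info *vi,
				     struct receive_queue *rq)
	{
		/* keep refill_work from being (re)scheduled and wait for a
		 * running instance to finish, so it cannot call
		 * napi_disable() on the queue we are about to pause
		 */
		disable_delayed_refill(vi);
		cancel_delayed_work_sync(&vi->refill);

		/* existing pause logic: napi_disable(), etc. */
		...
	}

with the matching enable_delayed_refill() once the queue is resumed.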


Thanks,
Quang Minh.