[REPORT] net/ncsi: possible RCU use-after-free in ncsi_vlan_rx_kill_vid()
From: Qihang
Date: Thu May 14 2026 - 22:49:07 EST
Hi netdev maintainers,
I would like to report a potential RCU use-after-free in the NCSI VLAN
handling path.
Summary
-------
In net/ncsi/ncsi-manage.c, ncsi_vlan_rx_kill_vid() removes an entry from
ndp->vlan_vids using list_del_rcu(), then immediately frees it with kfree().
A concurrent reader in set_one_vid() traverses the same list under
rcu_read_lock() and dereferences vlan->vid.
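This appears to violate RCU lifetime rules: an entry removed with
list_del_rcu() must not be freed until a grace period has elapsed (e.g. via
kfree_rcu() or call_rcu()), since readers already inside rcu_read_lock() may
still hold a pointer to it.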
Affected code
-------------
Reader side:
  net/ncsi/ncsi-manage.c:set_one_vid()
    rcu_read_lock();
    list_for_each_entry_rcu(vlan, &ndp->vlan_vids, list) {
        vid = vlan->vid;
        ...
    }
    rcu_read_unlock();
Writer side:
  net/ncsi/ncsi-manage.c:ncsi_vlan_rx_kill_vid()
    list_for_each_entry_safe(vlan, tmp, &ndp->vlan_vids, list)
        if (vlan->vid == vid) {
            list_del_rcu(&vlan->list);
            kfree(vlan);
        }
Why this looks unsafe
---------------------
- Reader and writer do not share a common lock that protects the lifetime of
  ndp->vlan_vids entries.
- Reader uses RCU traversal.
- Writer removes with list_del_rcu() but frees immediately with kfree().
- No synchronize_rcu()/call_rcu()/kfree_rcu() is visible before free.
Privilege / trigger model
-------------------------
The write-side trigger (VLAN deletion reaching ncsi_vlan_rx_kill_vid()) is
typically restricted to CAP_NET_ADMIN (in the relevant network namespace).
The read side runs asynchronously in kernel workqueue context.
Therefore, this is not a pure unprivileged trigger: a privileged actor (or a
delegated container admin with CAP_NET_ADMIN) is generally required to drive
the write-side race window.
Impact
------
Potential UAF read on struct vlan_vid when VLAN deletion runs concurrently
with the NCSI workqueue configuration path. At minimum this is a kernel
memory-safety violation; practical exploitability depends on timing and
allocator reuse.
Environment / reproduction status
---------------------------------
I validated this as a code-level issue on Linux 7.1.0-rc3 source.
I was not able to reliably reproduce a KASAN crash in my QEMU virtio-net setup,
likely because the NCSI runtime path is hardware/driver dependent in this
environment.
Suggested fix direction
-----------------------
Use an RCU-deferred free for vlan entries removed from ndp->vlan_vids, e.g.
kfree_rcu(vlan, rcu) (or call_rcu()), and ensure struct vlan_vid carries the
required deferred-reclamation field (a struct rcu_head member).
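To illustrate the intended ordering, here is a small userspace model of the
deferred-free pattern. It is only a sketch, not a tested kernel patch: every
model_* name is hypothetical, the list is simplified to a singly linked one,
and model_kfree_rcu()/model_grace_period() are stand-ins for kfree_rcu() and
the end of an RCU grace period. The point it shows is that an unlinked entry
stays valid for a reader that already holds a pointer to it until the grace
period has ended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct rcu_head (model only). */
struct model_rcu_head {
	struct model_rcu_head *next;
};

/* Loosely mirrors struct vlan_vid; the rcu member is the assumed new field. */
struct model_vlan_vid {
	struct model_vlan_vid *next;	/* simplified singly linked list */
	unsigned short vid;
	struct model_rcu_head rcu;	/* assumption: added by the fix */
};

/* Entries that were unlinked and are waiting for a grace period. */
static struct model_rcu_head *pending;

/* Stand-in for kfree_rcu(): queue the entry instead of freeing it now. */
static void model_kfree_rcu(struct model_vlan_vid *vlan)
{
	assert(vlan != NULL);
	vlan->rcu.next = pending;
	pending = &vlan->rcu;
}

/* Stand-in for the end of a grace period: free all queued entries. */
static int model_grace_period(void)
{
	int freed = 0;

	while (pending) {
		struct model_rcu_head *head = pending;

		pending = head->next;
		/* Recover the enclosing entry from its embedded rcu member. */
		free((char *)head - offsetof(struct model_vlan_vid, rcu));
		freed++;
	}
	return freed;
}

/* Insert a new entry at the head of the list; returns NULL on failure. */
static struct model_vlan_vid *model_add_vid(struct model_vlan_vid **list,
					    unsigned short vid)
{
	struct model_vlan_vid *vlan = calloc(1, sizeof(*vlan));

	if (!vlan)
		return NULL;
	vlan->vid = vid;
	vlan->next = *list;
	*list = vlan;
	return vlan;
}

/* Writer: unlink matching entries and defer the free; returns the count. */
static int model_kill_vid(struct model_vlan_vid **list, unsigned short vid)
{
	struct model_vlan_vid **pp = list;
	int removed = 0;

	while (*pp) {
		struct model_vlan_vid *vlan = *pp;

		if (vlan->vid == vid) {
			/* Unlink only; vlan->next stays intact for readers. */
			*pp = vlan->next;
			model_kfree_rcu(vlan);	/* not free(vlan) */
			removed++;
		} else {
			pp = &vlan->next;
		}
	}
	return removed;
}
```

In this model, a caller that saved a pointer to an entry before
model_kill_vid() can still read its vid afterwards, and the memory is only
released once model_grace_period() runs. In kernel terms the corresponding
change would be adding the rcu_head member to struct vlan_vid and replacing
kfree(vlan) with kfree_rcu(vlan, rcu) in ncsi_vlan_rx_kill_vid().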
Thanks,
Qihang