Re: [PATCH net-next v5 13/14] net: renesas: rswitch: add handler for FDB notification

From: Jakub Kicinski

Date: Mon May 25 2026 - 16:47:44 EST


This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: renesas: rswitch: add handler for FDB notification

The SWITCHDEV_FDB_ADD_TO_DEVICE notification writes the bridge MAC address
to the GWCA and adds a static rule to the MAC table.
The SWITCHDEV_FDB_DEL_TO_DEVICE removes the static rule from the MAC table.

> diff --git a/drivers/net/ethernet/renesas/rswitch_l2.c b/drivers/net/ethernet/renesas/rswitch_l2.c
> index d2deb7cb76349..4d0646283f88f 100644
> --- a/drivers/net/ethernet/renesas/rswitch_l2.c
> +++ b/drivers/net/ethernet/renesas/rswitch_l2.c
[ ... ]
> @@ -82,8 +83,9 @@ static void rswitch_update_l2_hw_forwarding(struct rswitch_private *priv)
> *
> * Do not allow L2 forwarding to self for hw port.
> */
> - iowrite32(FIELD_PREP(FWPC2_LTWFW, fwd_mask | BIT(rdev->port)),
> - priv->addr + FWPC2(rdev->port));
> + rswitch_modify(priv->addr, FWPC2(rdev->port),
> + FIELD_PREP(FWPC2_LTWFW, ~(fwd_mask | BIT(rdev->port))),
> + 0);

[Severity: High]
Does this bitwise logic correctly disable hardware forwarding for inactive
ports?

Since fwd_mask contains 0s for active ports and 1s for inactive ports, the
inverted mask ~(fwd_mask | BIT(rdev->port)) yields 1s for active ports and
0s for inactive ports.

This means the clear argument to rswitch_modify() will be 0 for inactive
ports. Combined with a set argument of 0, does this mean once a port is
active, it can never be disabled, potentially breaking Spanning Tree Protocol
(STP) blocking states?

[ ... ]
> @@ -285,6 +269,176 @@ static int rswitch_switchdev_blocking_event(struct notifier_block *nb,
> return notifier_from_errno(ret);
> }
>
[ ... ]
> +static void rswitch_switchdev_bridge_fdb_event_work(struct work_struct *work)
> +{
> + struct rswitch_switchdev_event_work *switchdev_work;
> + const unsigned char addr[] = {0, 0, 0, 0, 0, 0};
> + struct rswitch_device *rdev;
> + struct net_device *ndev;
> +
> + switchdev_work = container_of(work, struct rswitch_switchdev_event_work, work);
> + ndev = switchdev_work->ndev;
> +
> + rtnl_lock();
> +
> + /* Unfortunately all net_device members point to br0, there is no simple way to check
> + * if the event was triggered by a port device setting.
> + */
> + rswitch_for_all_ports(switchdev_work->priv, rdev) {
> + if (ether_addr_equal(rdev->ndev->dev_addr, switchdev_work->fdb_info.addr))
> + goto out;
> + }
> +
> + /* Handle only bridge device */
> + if (is_rdev(ndev) || !netif_is_bridge_master(ndev))
> + goto out;
> +
> + switch (switchdev_work->event) {
> + case SWITCHDEV_FDB_ADD_TO_DEVICE:
> + rswitch_gwca_write_mac_address(switchdev_work->priv, switchdev_work->fdb_info.addr);
> + rswitch_add_addr_to_mactable(switchdev_work->priv, switchdev_work->fdb_info.addr);
> + break;

[Severity: High]
Could this code process FDB events for unrelated software bridges on the
system?

While netif_is_bridge_master() checks if the device is a bridge, it doesn't
seem to verify if the bridge actually enslaves any of the switch's ports
(for example, checking if ndev == rdev->brdev).

Also, does writing the FDB address to the GWCA MAC address here via
rswitch_gwca_write_mac_address() unconditionally overwrite the switch's
Gateway MAC address with any static FDB entry?

[ ... ]
> +static int rswitch_switchdev_event(struct notifier_block *nb,
> + unsigned long event,
> + void *ptr)
> +{
[ ... ]
> + switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
> + if (!switchdev_work->fdb_info.addr)
> + goto err_addr_alloc;
> +
> + priv->switchdev_work = switchdev_work;

[Severity: High]
Can storing the dynamically allocated work item pointer in a single
priv->switchdev_work field lead to memory leaks or use-after-free
conditions?

If multiple events are queued rapidly, this pointer appears to be overwritten,
losing track of previous work items and leaking the memory for both the work
item and the fdb_info.addr reference.

> +
> + ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
> + fdb_info->addr);
> + netdev_hold(ndev, &switchdev_work->dev_tracker, GFP_ATOMIC);
> + queue_work(system_long_wq, &switchdev_work->work);
> +
> + break;
> + }

[Severity: Medium]
Is system_long_wq the right workqueue to use here?

Since system_long_wq allows concurrent execution across different CPUs, rapid
ADD and DEL events for the same MAC address could race to acquire rtnl_lock()
in the worker.

Could this lack of strict FIFO ordering allow a DEL event to execute before
an ADD event, leaving a stale, permanent entry in the MAC table? Would a
dedicated ordered workqueue be more appropriate to preserve switchdev event
ordering?

> diff --git a/drivers/net/ethernet/renesas/rswitch_main.c b/drivers/net/ethernet/renesas/rswitch_main.c
> index cfe7bfe6546b2..6fa37533feffd 100644
> --- a/drivers/net/ethernet/renesas/rswitch_main.c
> +++ b/drivers/net/ethernet/renesas/rswitch_main.c
[ ... ]
> @@ -2262,6 +2262,8 @@ static void rswitch_deinit(struct rswitch_private *priv)
> rswitch_gwca_ts_queue_free(priv);
> rswitch_gwca_linkfix_free(priv);
>
> + cancel_work_sync(&priv->switchdev_work->work);
> +
> rswitch_clock_disable(priv);
> }

[Severity: High]
Can this cancel_work_sync() call cause a NULL pointer dereference or
use-after-free?

If no FDB events were processed, priv->switchdev_work will be NULL, causing
an immediate dereference when accessing ->work.

If an event did finish, the struct is freed by its own worker in
rswitch_switchdev_bridge_fdb_event_work(), meaning cancel_work_sync() would
operate on freed memory.

[Severity: Critical]
This is a pre-existing issue, but earlier in the same series, there appears
to be a double lock acquisition without an unlock.

Looking at rswitch_gwca_change_mode() in
drivers/net/ethernet/renesas/rswitch_main.c:

int rswitch_gwca_change_mode(...)
{
...
if (!rswitch_agent_clock_is_enabled(priv->addr, priv->gwca.index)) {
spin_lock(&priv->agent_lock);
rswitch_agent_clock_ctrl(priv->addr, priv->gwca.index, 1);
spin_lock(&priv->agent_lock);
}
...

Could this second spin_lock() cause an immediate deadlock whenever the agent
clock is enabled?