Re: [PATCH net] net: ibm: emac: mal: fix potential system hang in mal_remove()

From: Jacob Keller

Date: Thu Jun 04 2026 - 14:53:06 EST


On 6/3/2026 4:08 PM, Rosen Penev wrote:
> napi_disable() is not idempotent and calling it on an already-disabled
> or unenabled NAPI context will cause the kernel to spin indefinitely
> waiting for the NAPI_STATE_SCHED bit to clear.
>
> In mal_remove(), napi_disable() is called unconditionally. If no MACs were
> registered, NAPI was never enabled. Also, if they were registered but
> subsequently unregistered, NAPI was already disabled in
> mal_unregister_commac(). In either case, calling napi_disable() causes
> the kernel to hang upon module removal.
>
> Fix this by only calling napi_disable() in mal_remove() if the commac list
> is not empty (which implies NAPI is enabled).
>
> Fixes: 59e90b2d2250 ("ibm_emac: Convert to use napi_struct independent of struct net_device")
> Assisted-by: antigravity:gemini-3.5-flash
> Signed-off-by: Rosen Penev <rosenp@xxxxxxxxx>
> ---
> drivers/net/ethernet/ibm/emac/mal.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/ibm/emac/mal.c b/drivers/net/ethernet/ibm/emac/mal.c
> index 83dd7f99d8d5..74526002d52b 100644
> --- a/drivers/net/ethernet/ibm/emac/mal.c
> +++ b/drivers/net/ethernet/ibm/emac/mal.c
> @@ -712,13 +712,13 @@ static void mal_remove(struct platform_device *ofdev)
> MAL_DBG(mal, "remove" NL);
>
> /* Synchronize with scheduled polling */
> - napi_disable(&mal->napi);
> -
> - if (!list_empty(&mal->list))
> + if (!list_empty(&mal->list)) {
> + napi_disable(&mal->napi);
> /* This is *very* bad */
> WARN(1, KERN_EMERG
> "mal%d: commac list is not empty on remove!\n",
> mal->index);

This one doesn't make sense to me. The list_empty() check guards a WARN(),
which indicates that a non-empty list is not supposed to happen here.

This implies that list_empty() should normally be true; otherwise we'd see
a WARN every time mal_remove() is called.

But in that case, the old code was calling napi_disable() with an empty
list on nearly every remove, which is exactly the case your commit message
claims is unsafe.

At best, this list_empty() check is the wrong way to tell whether NAPI is
disabled; at worst, this whole change is pointless.
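
To spell out the reasoning, here is a rough userspace model (not kernel
code; the struct, enum and helper names are made up for illustration). It
assumes, per the commit message, that napi_disable() on a NAPI that is not
enabled never returns, and models that as a REMOVE_HANG return value:

```c
#include <stdbool.h>

/* Hypothetical userspace model of the two mal_remove() variants. */
enum remove_result { REMOVE_OK, REMOVE_HANG };

struct mal_model {
	bool list_empty;   /* stands in for list_empty(&mal->list) */
	bool napi_enabled; /* true if NAPI is currently enabled */
};

/* Assumption from the commit message: disabling a non-enabled NAPI hangs. */
static enum remove_result model_napi_disable(struct mal_model *m)
{
	if (!m->napi_enabled)
		return REMOVE_HANG;
	m->napi_enabled = false;
	return REMOVE_OK;
}

/* Old behaviour: napi_disable() is called unconditionally. */
static enum remove_result remove_old(struct mal_model m, bool *warned)
{
	enum remove_result r = model_napi_disable(&m);

	*warned = !m.list_empty; /* WARN fires only for a non-empty list */
	return r;
}

/* Patched behaviour: napi_disable() is called only inside the WARN branch. */
static enum remove_result remove_patched(struct mal_model m, bool *warned)
{
	*warned = !m.list_empty;
	if (!m.list_empty)
		return model_napi_disable(&m);
	return REMOVE_OK;
}
```

With an empty list and NAPI disabled, which the WARN suggests is the normal
state at remove time, remove_old() reports a hang and remove_patched() never
reaches napi_disable() at all.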
> + }
>
> mal_reset(mal);
>
> --
> 2.54.0
>
>