Re: [PATCH v12 1/2] IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier
From: Jason Gunthorpe
Date: Tue Jun 02 2026 - 20:29:52 EST
On Sun, May 10, 2026 at 11:22:53PM +0100, Prathamesh Deshpande wrote:
> mlx5_ib_alloc_transport_domain() allocates a transport domain and then
> may fail in mlx5_ib_enable_lb(). In that case, the allocated TD is leaked.
>
> Fix this by deallocating the TD when mlx5_ib_enable_lb() returns an
> error. Also return 0 explicitly in the no-loopback-capability success
> branch, and move dev->lb.mutex initialization to mlx5_ib_stage_init_init().
>
> Destroy dev->lb.mutex in the matching cleanup path and in init failure
> paths after the mutex is initialized.
Too many things in one patch.
> @@ -2068,9 +2068,13 @@ static int mlx5_ib_alloc_transport_domain(struct mlx5_ib_dev *dev, u32 *tdn,
> if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
> (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
> !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
> - return err;
> + return 0;
> +
> + err = mlx5_ib_enable_lb(dev, true, false);
> + if (err)
> + mlx5_cmd_dealloc_transport_domain(dev->mdev, *tdn, uid);
>
> - return mlx5_ib_enable_lb(dev, true, false);
> + return err;
> }
This seems like it might be reasonable
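As a rough userspace sketch of that rollback shape (every name here is a hypothetical stand-in, not the mlx5 API): allocate, return 0 early when there is nothing more to do, and undo the allocation if the later step fails.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; not the mlx5 API. */
static int live_tds;                /* "transport domains" currently allocated */
static bool enable_lb_should_fail;  /* test knob to force the failure path */

static int fake_alloc_td(unsigned int *tdn)
{
	*tdn = 1;
	live_tds++;
	return 0;
}

static void fake_dealloc_td(unsigned int tdn)
{
	(void)tdn;
	live_tds--;
}

static int fake_enable_lb(void)
{
	return enable_lb_should_fail ? -ENOMEM : 0;
}

static int alloc_transport_domain_sketch(unsigned int *tdn, bool has_lb_cap)
{
	int err;

	err = fake_alloc_td(tdn);
	if (err)
		return err;

	/* Nothing else to set up: report success explicitly. */
	if (!has_lb_cap)
		return 0;

	/* Later step failed: roll back the allocation made above. */
	err = fake_enable_lb();
	if (err)
		fake_dealloc_td(*tdn);

	return err;
}
```

With the failure knob set, the function returns the error and the live counter goes back to zero, which is the leak the quoted hunk is trying to close.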
But mutex_destroy() is only a debugging feature; we don't need to rework
code to carefully destroy it in error paths. It should be initialized when
the memory is allocated and destroyed when the memory is freed.
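A minimal userspace sketch of that lifetime rule, using pthreads as an analogy (struct demo_dev and the demo_dev_* helpers are hypothetical, not the driver's code; note that in userspace pthread_mutex_destroy() is not debug-only the way the kernel's mutex_destroy() is):

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical device structure; not the real struct mlx5_ib_dev. */
struct demo_dev {
	pthread_mutex_t lb_mutex;
	int lb_users;
};

/* Initialize the mutex unconditionally, right where the memory is allocated. */
static struct demo_dev *demo_dev_alloc(void)
{
	struct demo_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	if (pthread_mutex_init(&dev->lb_mutex, NULL)) {
		free(dev);
		return NULL;
	}
	return dev;
}

/* Destroy it in exactly one place, right before the memory is freed. */
static void demo_dev_free(struct demo_dev *dev)
{
	if (!dev)
		return;
	pthread_mutex_destroy(&dev->lb_mutex);
	free(dev);
}

/* Any later stage can take the lock without asking whether it was set up. */
static int demo_dev_enable_lb(struct demo_dev *dev)
{
	int users;

	pthread_mutex_lock(&dev->lb_mutex);
	users = ++dev->lb_users;
	pthread_mutex_unlock(&dev->lb_mutex);
	return users;
}
```

Tying the mutex to the allocation means no intermediate error path has to know about it, and no capability check decides whether the lock is valid.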
> - if ((MLX5_CAP_GEN(dev->mdev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
> - (MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) ||
> - MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
> - mutex_init(&dev->lb.mutex);
Though I am wondering why someone did this..
Jason