Re: [PATCH net-next V2 4/8] net/mlx5: Switch vport HCA cap helpers to kvzalloc

From: Moshe Shemesh

Date: Tue May 19 2026 - 11:56:57 EST




On 5/18/2026 10:13 AM, Tariq Toukan wrote:
From: Moshe Shemesh <moshe@xxxxxxxxxx>

mlx5_vport_set_other_func_cap() and mlx5_vport_get_vhca_id() allocate
command buffers that embed the HCA capability union, exceeding 4KiB.
Use kvzalloc/kvfree so the allocation can fall back to vmalloc when
contiguous memory is scarce.

Signed-off-by: Moshe Shemesh <moshe@xxxxxxxxxx>
Signed-off-by: Tariq Toukan <tariqt@xxxxxxxxxx>
---
drivers/net/ethernet/mellanox/mlx5/core/vport.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vport.c b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
index 4effe37fd455..f8e6b1ab7c5c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/vport.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
@@ -1336,7 +1336,7 @@ int mlx5_vport_get_vhca_id(struct mlx5_core_dev *dev, u16 vport, u16 *vhca_id)
if (mlx5_esw_vport_vhca_id(dev->priv.eswitch, vport, vhca_id))
return 0;
- query_ctx = kzalloc(query_out_sz, GFP_KERNEL);
+ query_ctx = kvzalloc(query_out_sz, GFP_KERNEL);
if (!query_ctx)
return -ENOMEM;

Sashiko.dev says:
"
The commit message notes that these structures exceed 4KiB and are vulnerable to allocation failures under memory fragmentation. Are there other allocations of these same structures that were missed?
For instance, mlx5_core_get_caps_mode() in main.c allocates query_hca_cap_out with kzalloc(). Since this is on the initialization path for every mlx5 device (including dynamically created SR-IOV VFs), could this fail under fragmentation and prevent VF creation?
Other locations that still appear to use kzalloc() for these >4KiB structures include devlink configuration paths like mlx5_devlink_port_fn_migratable_set() and mlx5_devlink_port_fn_roce_set() in eswitch_offloads.c and eswitch.c.
Should these be converted to kvzalloc() as well to prevent similar allocation failures?
"

As it says, there are more places in the driver that still use kzalloc() for these buffers. This patch handles only the vport helpers; I will send cleanup patches for the others.

@@ -1348,7 +1348,7 @@ int mlx5_vport_get_vhca_id(struct mlx5_core_dev *dev, u16 vport, u16 *vhca_id)
*vhca_id = MLX5_GET(cmd_hca_cap, hca_caps, vhca_id);
out_free:
- kfree(query_ctx);
+ kvfree(query_ctx);
return err;
}
EXPORT_SYMBOL_GPL(mlx5_vport_get_vhca_id);
@@ -1363,7 +1363,7 @@ int mlx5_vport_set_other_func_cap(struct mlx5_core_dev *dev, const void *hca_cap
void *set_ctx;
int ret;
- set_ctx = kzalloc(set_sz, GFP_KERNEL);
+ set_ctx = kvzalloc(set_sz, GFP_KERNEL);
if (!set_ctx)
return -ENOMEM;
@@ -1392,6 +1392,6 @@ int mlx5_vport_set_other_func_cap(struct mlx5_core_dev *dev, const void *hca_cap
MLX5_SET(set_hca_cap_in, set_ctx, function_id, function_id);
ret = mlx5_cmd_exec_in(dev, set_hca_cap, set_ctx);
- kfree(set_ctx);
+ kvfree(set_ctx);
return ret;
}
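
For readers following the thread, the fallback behaviour that the commit message relies on can be modeled in userspace roughly as below. This is an illustrative sketch only, not driver or kernel code: every model_* name is hypothetical, calloc() stands in for both the contiguous and the vmalloc-style allocator, and the 4096-byte threshold and 8KiB buffer size are assumptions chosen to mirror the "exceeding 4KiB" note above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model: none of the model_* names are kernel APIs. */
#define MODEL_PAGE_SIZE 4096u

/* Set to true to simulate fragmented memory with no large contiguous runs. */
bool model_contig_scarce;

/* Stand-in for kmalloc(): multi-page requests fail under fragmentation. */
void *model_contig_alloc(size_t size)
{
	if (model_contig_scarce && size > MODEL_PAGE_SIZE)
		return NULL;
	return calloc(1, size);
}

/* Stand-in for vzalloc(): does not need physically contiguous pages. */
void *model_virt_alloc(size_t size)
{
	return calloc(1, size);
}

/* Stand-in for kvzalloc(): contiguous attempt first, then the fallback. */
void *model_kvzalloc(size_t size, bool *used_fallback)
{
	void *p = model_contig_alloc(size);

	*used_fallback = false;
	if (!p) {
		p = model_virt_alloc(size);
		*used_fallback = (p != NULL);
	}
	return p;
}

/* Stand-in for kvfree(): one free routine regardless of origin. */
void model_kvfree(void *p)
{
	free(p);
}

/*
 * Returns 0 if a buffer larger than one page is still obtained,
 * via the fallback, while contiguous memory is scarce.
 */
int model_demo(void)
{
	const size_t cmd_sz = 2 * MODEL_PAGE_SIZE; /* assumed size, > 4KiB */
	bool fallback;
	void *buf;

	model_contig_scarce = true;

	/* The contiguous-only path is expected to fail here. */
	buf = model_contig_alloc(cmd_sz);
	if (buf) {
		free(buf);
		return -1;
	}

	/* The kvzalloc-style path is expected to succeed via the fallback. */
	buf = model_kvzalloc(cmd_sz, &fallback);
	if (!buf)
		return -1;
	model_kvfree(buf);
	return fallback ? 0 : -1;
}
```

In the kernel, kvfree() decides between kfree() and vfree() based on the address it is given, which is why each kfree() on these buffers is switched to kvfree() together with the allocation.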