[PATCH rdma-next 07/10] RDMA/mlx5: Fix UAF in DCT destroy due to race with create
From: Edward Srouji
Date: Wed Mar 25 2026 - 15:07:30 EST
A race between mlx5_core_destroy_dct() and mlx5_core_create_dct() can
lead to a use-after-free.

Once _mlx5_core_destroy_dct() has released the DCT to firmware, the DCTN
can immediately be reallocated to a new DCT that is being created
concurrently. If the create path stores the new DCT in the xarray before
the destroy path erases its entry, the destroy path deletes the new DCT's
entry instead. Later accesses then hit freed memory.

Fix this by replacing the unconditional xa_erase_irq() with an
xa_cmpxchg_irq() that erases the entry only if it still contains
XA_ZERO_ENTRY (that is, it has not been replaced), so that a newly
created DCT is preserved.
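
For illustration only, the problematic interleaving can be replayed in
userspace. The sketch below is not driver code. It assumes a single C11
atomic pointer standing in for the xarray slot, a hypothetical
SLOT_RESERVED marker standing in for XA_ZERO_ENTRY, and hypothetical
helper names. The create path is modelled as a plain store, and the
interleaving is replayed sequentially rather than with real threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for XA_ZERO_ENTRY: a reserved, non-NULL marker. */
static int zero_marker;
#define SLOT_RESERVED ((void *)&zero_marker)

/* Hypothetical model of the one xarray slot indexed by a DCT number. */
static _Atomic(void *) slot;

/* Old destroy tail: clears the slot whatever it holds (unconditional erase). */
static void destroy_tail_erase(void)
{
	atomic_store(&slot, (void *)NULL);
}

/* Fixed destroy tail: clears the slot only if it is still reserved. */
static void destroy_tail_cmpxchg(void)
{
	void *expected = SLOT_RESERVED;

	/* On mismatch the slot is left untouched; the result is ignored. */
	atomic_compare_exchange_strong(&slot, &expected, (void *)NULL);
}

/*
 * Replay the racy ordering step by step and return what the slot holds
 * afterwards. 'fixed' selects the compare-and-exchange variant.
 */
static void *replay_race(int fixed, void *new_dct)
{
	assert(new_dct != NULL && new_dct != SLOT_RESERVED);

	/* Destroy path has marked the slot as reserved. */
	atomic_store(&slot, SLOT_RESERVED);

	/* The number was released; a concurrent create reuses it. */
	atomic_store(&slot, new_dct);

	/* Destroy path resumes and runs its final step. */
	if (fixed)
		destroy_tail_cmpxchg();
	else
		destroy_tail_erase();

	return atomic_load(&slot);
}
```

With this model, replay_race(0, p) returns NULL, meaning the entry of the
new object is lost, while replay_race(1, p) returns p, meaning the entry
survives the late destroy step.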
Fixes: afff24899846 ("RDMA/mlx5: Handle DCT QP logic separately from low level QP interface")
Signed-off-by: Edward Srouji <edwards@xxxxxxxxxx>
Reviewed-by: Michael Guralnik <michaelgur@xxxxxxxxxx>
---
drivers/infiniband/hw/mlx5/qpc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/mlx5/qpc.c b/drivers/infiniband/hw/mlx5/qpc.c
index 146d03ae40bd9fd9650530fba77eb7e942d5fe79..a7a4f9420271a228e161aaac1ffa432d304ce431 100644
--- a/drivers/infiniband/hw/mlx5/qpc.c
+++ b/drivers/infiniband/hw/mlx5/qpc.c
@@ -314,7 +314,14 @@ int mlx5_core_destroy_dct(struct mlx5_ib_dev *dev,
 		xa_cmpxchg_irq(&table->dct_xa, dct->mqp.qpn, XA_ZERO_ENTRY, dct, 0);
 		return err;
 	}
-	xa_erase_irq(&table->dct_xa, dct->mqp.qpn);
+
+	/*
+	 * A race can occur where a concurrent create gets the same dctn
+	 * (after hardware released it) and overwrites XA_ZERO_ENTRY with
+	 * its new DCT before we reach here. In that case, we must not erase
+	 * the entry as it now belongs to the new DCT.
+	 */
+	xa_cmpxchg_irq(&table->dct_xa, dct->mqp.qpn, XA_ZERO_ENTRY, NULL, 0);
 	return 0;
 }
--
2.49.0