Re: [net-next V4 5/5] eth: mlx5: Move pause storm errors to pause stats
From: Paolo Abeni
Date: Thu Mar 05 2026 - 06:26:28 EST
On 3/3/26 12:01 AM, Mohsin Bashir wrote:
> Report device_stall_critical_watermark_cnt as tx_pause_storm_events in
> the ethtool_pause_stats struct. This counter tracks pause storm error
> events which indicate the NIC has been sending pause frames for an
> extended period due to a stall.
>
> The ethtool_pause_stats struct reports these stalls as a single value,
> whereas the device supports tracking them per priority. Aggregate the
> counter across all priority classes to capture stalls on all priorities.
> Note that the stats are fetched from the device for each priority via
> mlx5_core_access_reg().
>
> Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
> Signed-off-by: Mohsin Bashir <mohsin.bashr@xxxxxxxxx>
> ---
> .../ethernet/mellanox/mlx5/core/en_stats.c | 30 +++++++++++++++++++
> 1 file changed, 30 insertions(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
> index a8af84fc9763..1a3ecf073913 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
> @@ -916,11 +916,30 @@ static int mlx5e_stats_get_ieee(struct mlx5_core_dev *mdev,
> sz, MLX5_REG_PPCNT, 0, 0);
> }
>
> +static int mlx5e_stats_get_per_prio(struct mlx5_core_dev *mdev,
> + u32 *ppcnt_per_prio, int prio)
> +{
> + u32 in[MLX5_ST_SZ_DW(ppcnt_reg)] = {};
> + int sz = MLX5_ST_SZ_BYTES(ppcnt_reg);
> +
> + if (!(MLX5_CAP_PCAM_FEATURE(mdev, pfcc_mask) &&
> + MLX5_CAP_DEBUG(mdev, stall_detect)))
> + return -EOPNOTSUPP;
> +
> + MLX5_SET(ppcnt_reg, in, local_port, 1);
> + MLX5_SET(ppcnt_reg, in, grp, MLX5_PER_PRIORITY_COUNTERS_GROUP);
> + MLX5_SET(ppcnt_reg, in, prio_tc, prio);
> + return mlx5_core_access_reg(mdev, in, sz, ppcnt_per_prio, sz,
> + MLX5_REG_PPCNT, 0, 0);
> +}
> +
> void mlx5e_stats_pause_get(struct mlx5e_priv *priv,
> struct ethtool_pause_stats *pause_stats)
> {
> u32 ppcnt_ieee_802_3[MLX5_ST_SZ_DW(ppcnt_reg)];
> struct mlx5_core_dev *mdev = priv->mdev;
> + u64 ps_stats = 0;
> + int prio;
>
> if (mlx5e_stats_get_ieee(mdev, ppcnt_ieee_802_3))
> return;
> @@ -933,6 +952,17 @@ void mlx5e_stats_pause_get(struct mlx5e_priv *priv,
> MLX5E_READ_CTR64_BE_F(ppcnt_ieee_802_3,
> eth_802_3_cntrs_grp_data_layout,
> a_pause_mac_ctrl_frames_received);
> +
> + for (prio = 0; prio < NUM_PPORT_PRIO; prio++) {
> + if (mlx5e_stats_get_per_prio(mdev, ppcnt_ieee_802_3, prio))
> + return;
> +
> + ps_stats += MLX5E_READ_CTR64_BE_F(ppcnt_ieee_802_3,
> + eth_per_prio_grp_data_layout,
> + device_stall_critical_watermark_cnt);
> + }
> +
> + pause_stats->tx_pause_storm_events = ps_stats;
> }
>
> void mlx5e_stats_eth_phy_get(struct mlx5e_priv *priv,
It would be nice if someone at nVidia could have a look at the mlx bits
here. Waiting a little longer for that.
Thanks,
Paolo