Re: [PATCH v19 net-next 2/9] octeontx2-af: npc: cn20k: debugfs enhancements

From: Ratheesh Kannoth

Date: Sun Jun 07 2026 - 22:27:07 EST


On 2026-06-05 at 12:02:38, Ratheesh Kannoth (rkannoth@xxxxxxxxxxx) wrote:
> Improve MCAM visibility and field debugging for CN20K NPC.
>
> - Extend "mcam_layout" to show enabled (+) or disabled state per entry
> so status can be verified without parsing the full "mcam_entry" dump.
> - Add "dstats" debugfs entry: for enabled MCAM indices, print hit deltas
> since the prior read by comparing hardware counters to a per-entry
> software baseline and advancing that baseline after each read (hardware
> counters are not cleared).
> - Add "mismatch" debugfs entry: lists MCAM entries that are enabled
> but not explicitly allocated, helping diagnose allocation/field issues.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@xxxxxxxxxxx>

https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260605063245.3553861-1-rkannoth%40marvell.com says

t
>> + stats = rvu_read64(rvu, blkaddr,
>> + NPC_AF_CN20K_MCAMEX_BANKX_STAT_EXT(idx, bank));
>> + if (!stats)
>> + continue;
>> + if (stats == dstats[bank][idx])
>> + continue;
>> +
>> + if (stats < dstats[bank][idx])
>> + dstats[bank][idx] = 0;
>Can dstats[bank][idx] overflow this static array?
No. This array is allocated with Max values.
MAX_NUM_BANKS, MAX_SUBBANK_DEPTH and MAX_NUM_SUB_BANKS

>The header comment asserts that MAX_NUM_BANKS, MAX_NUM_SUB_BANKS and
>MAX_SUBBANK_DEPTH are hard limits "preventing any possibility of
>out-of-bounds access", but npc_priv_init() does not enforce that
>contract on bank_depth itself. It only checks num_banks, num_subbanks,
>and subbank_depth = bank_depth / num_subbanks.
>bank_depth maximum across SoC is 8192. We will add a check for num_subbnks as well
>as part of hardening series to net-next once this patch series is merged.
Will add a check on bank_depth in npc_priv_init() in hardening patch series to
net-next after this series is merged.

>Is the dstats baseline correct on the first read, and after an MCAM slot
>is freed and reallocated?
>The commit message describes dstats as printing "hit deltas since the
>prior read by comparing hardware counters to a per-entry software
>baseline and advancing that baseline after each read". The baseline
>dstats[bank][idx] is keyed only by physical hardware slot.
>On the first read after module load, dstats[bank][idx] is 0, so:
> delta = stats - dstats[bank][idx];
>prints the cumulative counter rather than a delta.
>When an MCAM slot is freed and reallocated to a different rule or PF
>(and the hardware counter is preserved across moves, as
>npc_defrag_move_vdx_to_free does via rvu_write64(... STAT_EXT ...,
>stats)), the retained baseline mixes counts that belonged to the
>previous owner into the delta attributed to the new PF.
>Should the baseline be cleared inside npc_cn20k_enable_mcam_entry() on
>each enable/disable transition, in addition to whatever happens on read?
>Is reading dstats meant to be destructive?
>Each read advances dstats[bank][idx] = stats, so two concurrent readers
>(for example a monitoring daemon and a human admin) race for events:
>whichever reads first advances the baseline, and the other sees zero
>deltas for entries it would otherwise have observed. Most observability
>files under debugfs/procfs/sysfs are passive snapshots so multiple tools
>can scrape them concurrently. Would a snapshot file plus userspace delta
>computation, or a separate write-to-clear sibling, fit better here?

"dstats" is for live debugging in field, where user wants to know which NPC
mcam entry is being hit. This debugfs entry helps to identify the mcam entry.
If mcam entry is allocated/deallocated/allocated again, first time 'cat'ing debugfs
entry, may show wrong delta. But next time, it will show the correct value as s/w
baseline array would have updated to the hw stats value correctly.

>> +
>> + pf = 0xFFFF;
>> + map = xa_load(&npc_priv->xa_idx2pf_map, mcam_idx);
>> + if (map)
>> + pf = xa_to_value(map);
>> +
>> + delta = stats - dstats[bank][idx];
>> +
>> + snprintf(buff, sizeof(buff), "%u\t%#04x\t%llu\n",
>> + mcam_idx, pf, delta);
>> + seq_puts(s, buff);
>> +
>> + dstats[bank][idx] = stats;
>> + }
>> + }
>> +
>> + mutex_unlock(&stats_lock);
>> + return 0;
>> +}
>> +
>> +/* "%u\t%#04x\t%llu\n" needs less than 64 characters to print */
>> +#define TOTAL_SZ (MAX_NUM_BANKS * MAX_NUM_SUB_BANKS * MAX_SUBBANK_DEPTH * 64)
>> +DEFINE_OCTEONTX2_DEBUGFS_ATTRIBUTE_WITH_SIZE(npc_mcam_dstats, TOTAL_SZ);
>Is the up-front 1 MiB allocation per open() necessary here?
>TOTAL_SZ evaluates to MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
>MAX_SUBBANK_DEPTH * 64 = 1,048,576 bytes, and single_open_size() asks
>seq_file to preallocate that buffer regardless of how much the file
>actually emits. dstats only emits lines for entries with changed
>counters, and mismatch is typically a few lines.
>The size also does not scale to the actual hardware geometry; it uses
>the MAX constants instead of npc_priv->num_banks * npc_priv->bank_depth.
>Could the standard seq_file iterator pattern (start/next/stop/show via
>DEFINE_SHOW_ATTRIBUTE) be used instead, so the buffer grows on demand.

BUffer is pre allocated to the maximum required.

>> }
>>
>> @@ -4513,10 +4527,15 @@ static int npc_priv_init(struct rvu *rvu)
>> return -EINVAL;
>> }
>>
>> - npc_priv.num_subbanks = num_subbanks;
>> -
>> subbank_depth = bank_depth / num_subbanks;
>> + if (subbank_depth > MAX_SUBBANK_DEPTH) {
>> + dev_err(rvu->dev,
>> + "Invalid subbank depth %u\n",
>> + subbank_depth);
>> + return -EINVAL;
>> + }
>>
>> + npc_priv.num_subbanks = num_subbanks;
>> npc_priv.bank_depth = bank_depth;
>> npc_priv.subbank_depth = subbank_depth;
>Tying back to the dstats overflow question above: bank_depth itself is
>not bounded here. With hardware reporting bank_depth = 8193 and
>num_subbanks = 32, integer division gives subbank_depth = 256, which
>passes this check, but bank_depth still exceeds MAX_NUM_SUB_BANKS *
>MAX_SUBBANK_DEPTH = 8192. Should there be an explicit check on
>bank_depth, or a check that bank_depth == subbank_depth * num_subbanks,
>to keep the en_map bitmap and dstats[][] within their declared sizes?
bank_depth is less than MAX_BANK_DEPTH in all Soc. WIll add a check in
next patch series (hardening series to check against bank_depth <= MAX_BANK_DEPTH)
int npc_priv_init() (During probe())