Re: [PATCH 7/8] perf sched: Replace BUG_ON on invalid CPU with graceful skip

From: Ian Rogers

Date: Wed Jun 03 2026 - 11:35:50 EST


On Tue, Jun 2, 2026 at 4:57 PM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
>
> From: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
>
> latency_switch_event(), latency_runtime_event(), and map_switch_event()
> use BUG_ON(cpu >= MAX_CPUS || cpu < 0) to validate the sample CPU.
> When PERF_SAMPLE_CPU is absent from the sample type,
> evsel__parse_sample() initializes sample->cpu to (u32)-1. Casting
> this to int yields -1, which triggers the BUG_ON and aborts perf sched.
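
For readers following along, the sentinel arithmetic above can be sketched in isolation. This is only an illustration, not code from the patch: the helper name sample_cpu_in_range() is hypothetical, MAX_CPUS is assumed to be 4096 here, and the (u32)-1 to -1 conversion relies on the usual two's-complement behaviour of the compilers perf is built with.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_CPUS 4096 /* assumed value for this sketch */

/*
 * Hypothetical helper, not part of the patch: returns true when the raw
 * sample CPU can safely index an array of MAX_CPUS entries.
 */
static bool sample_cpu_in_range(uint32_t raw_cpu)
{
	/* (u32)-1, the "CPU absent" sentinel, becomes -1 after this cast */
	int cpu = (int)raw_cpu;

	return cpu >= 0 && cpu < MAX_CPUS;
}
```

With this, sample_cpu_in_range(UINT32_MAX) is false, which is the case that used to hit the BUG_ON.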
>
> The central CPU validation in perf_session__deliver_event() intentionally
> preserves the (u32)-1 sentinel for downstream tools like perf script
> and perf inject, so leaf callbacks must handle it themselves.
>
> Replace the three BUG_ON calls with graceful skips using pr_warning(),
> matching the existing pattern in process_sched_switch_event() and
> process_sched_runtime_event() earlier in the same file. Include the
> file offset for cross-referencing with perf report -D.
>
> Reported-by: sashiko-bot@xxxxxxxxxx # Running on a local machine
> Assisted-by: Claude Opus 4.6 <noreply@xxxxxxxxxxxxx>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>

Reviewed-by: Ian Rogers <irogers@xxxxxxxxxx>

Thanks,
Ian

> ---
> tools/perf/builtin-sched.c | 22 +++++++++++++++++++---
> 1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 9ec8e049e19b0038..81833d169470582b 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -1145,7 +1145,12 @@ static int latency_switch_event(struct perf_sched *sched,
>  	int cpu = sample->cpu, err = -1;
>  	s64 delta;
>
> -	BUG_ON(cpu >= MAX_CPUS || cpu < 0);
> +	/* perf.data is untrusted input — CPU may be absent or corrupted */
> +	if (cpu >= MAX_CPUS || cpu < 0) {
> +		pr_warning("WARNING: at offset %#" PRIx64 ": out-of-bound sample CPU %d, skipping sample\n",
> +			   sample->file_offset, cpu);
> +		return 0;
> +	}
>
>  	timestamp0 = sched->cpu_last_switched[cpu];
>  	sched->cpu_last_switched[cpu] = timestamp;
> @@ -1215,7 +1220,13 @@ static int latency_runtime_event(struct perf_sched *sched,
>  	if (thread == NULL)
>  		return -1;
>
> -	BUG_ON(cpu >= MAX_CPUS || cpu < 0);
> +	/* perf.data is untrusted input — CPU may be absent or corrupted */
> +	if (cpu >= MAX_CPUS || cpu < 0) {
> +		pr_warning("WARNING: at offset %#" PRIx64 ": out-of-bound sample CPU %d, skipping sample\n",
> +			   sample->file_offset, cpu);
> +		err = 0;
> +		goto out_put;
> +	}
>  	if (!atoms) {
>  		if (thread_atoms_insert(sched, thread))
>  			goto out_put;
> @@ -1640,7 +1651,12 @@ static int map_switch_event(struct perf_sched *sched, struct perf_sample *sampl
>  	const char *str;
>  	int ret = -1;
>
> -	BUG_ON(this_cpu.cpu >= MAX_CPUS || this_cpu.cpu < 0);
> +	/* perf.data is untrusted input — CPU may be absent or corrupted */
> +	if (this_cpu.cpu >= MAX_CPUS || this_cpu.cpu < 0) {
> +		pr_warning("WARNING: at offset %#" PRIx64 ": out-of-bound sample CPU %d, skipping sample\n",
> +			   sample->file_offset, this_cpu.cpu);
> +		return 0;
> +	}
>
>  	if (this_cpu.cpu > sched->max_cpu.cpu)
>  		sched->max_cpu = this_cpu;
> --
> 2.54.0
>
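
One detail in the latency_runtime_event() hunk that may be worth spelling out: it sets err = 0 and jumps to out_put instead of returning directly, presumably because a thread reference is already held at that point and has to be dropped. A standalone sketch of that shape follows. Everything in it is a stand-in invented for illustration (struct fake_thread, fake_thread_put(), handle_runtime_sample(), the MAX_CPUS value, and fprintf() in place of pr_warning()); it is not the perf code itself.

```c
#include <stdio.h>

#define MAX_CPUS 4096 /* assumed value for this sketch */

/* Hypothetical stand-in for a reference-counted thread object */
struct fake_thread {
	int refcnt;
};

static void fake_thread_put(struct fake_thread *thread)
{
	thread->refcnt--;
}

/* Hypothetical handler mirroring the skip-and-release shape of the hunk */
static int handle_runtime_sample(struct fake_thread *thread, int cpu,
				 unsigned long long file_offset)
{
	int err = -1;

	thread->refcnt++; /* stands in for the lookup that takes a reference */

	if (cpu >= MAX_CPUS || cpu < 0) {
		fprintf(stderr,
			"WARNING: at offset %#llx: out-of-bound sample CPU %d, skipping sample\n",
			file_offset, cpu);
		err = 0;      /* a skipped sample is not an error */
		goto out_put; /* still release the reference taken above */
	}

	/* normal per-CPU processing would go here */
	err = 0;
out_put:
	fake_thread_put(thread);
	return err;
}
```

In this sketch both the skip path and the normal path leave the reference count where it started, which is the property the goto preserves.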