Re: [PATCH v3] libbpf: harden parse_vma_segs() path parsing
From: Emil Tsalapatis
Date: Fri May 22 2026 - 14:04:04 EST
On Fri May 22, 2026 at 8:56 AM EDT, Michael Bommarito wrote:
> parse_vma_segs() in tools/lib/bpf/usdt.c parses /proc/<pid>/maps
> with two widthless scansets, "%s" into mode[16] and "%[^\n]"
> into line[PATH_MAX]. Both assume the kernel caps maps records to
> PATH_MAX; it does not.
>
> show_map_vma() emits the path via seq_path() against the seq buffer,
> which doubles on overflow (m->size <<= 1 in fs/seq_file.c), so a VMA
> whose backing path is a deeply nested directory tree produces a single
> maps record longer than PATH_MAX. scanf "%s" / "%[^\n]" without a
> width writes until the field terminator regardless of destination size,
> so a bpf_program__attach_usdt() consumer attaching against an
> attacker-controlled PID overflows its own stack inside parse_vma_segs().
>
> Bound both scansets to the declared buffer sizes ("%15s" for mode[16]
> and "%4095[^\n]" for line[PATH_MAX]) and drain any residue past
> line[4094] with "%*[^\n]" before the trailing "\n", matching the
> libbpf-local fscanf style. Without the drain the residue of an
> over-long record would stay in the stream and break the next "%zx-%zx"
> parse, so the loop would exit early and any maps records after the
> over-long entry would be silently skipped.
>
> Also stop using sscanf(..., "%s") to peel the /proc/<pid>/root prefix
> from lib_path. Build the exact prefix for the requested PID with
> snprintf(), check it directly, and copy the remainder with
> libbpf_strlcpy(). That removes a second unbounded stack write and
> preserves paths containing spaces.
>
> Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic")
> Cc: stable@xxxxxxxxxxxxxxx
> Assisted-by: Claude:claude-opus-4-7
> Signed-off-by: Michael Bommarito <michael.bommarito@xxxxxxxxx>
Reviewed-by: Emil Tsalapatis <emil@xxxxxxxxxxxxxxx>
> ---
> v3:
> - Correct Fixes tag to the initial USDT implementation commit,
> per BPF CI review after adding second site.
>
> v2:
> - Replace the unbounded /proc/<pid>/root sscanf() path peeling with
> snprintf() + prefix check + libbpf_strlcpy(), addressing review
> feedback on v1 and preserving paths containing spaces.
> - Keep the v1 maps parser fix using bounded fscanf() scansets and a
> suppressed scanset drain for over-long records.
> - Re-ran real parse_vma_segs() ASAN harnesses for the original maps
> overflow, the proc-root overflow, proc-root paths with spaces, and
> adjacent successful parses after an over-long maps record.
>
> Reproduced with Debian 12 on rootless podman: an unprivileged
> container process mkdirs 50 nested 200-char directories and mmaps
> a file at the bottom, producing a 10403-byte /proc/<host-pid>/maps
> line. A harness on the host then calls the real parse_vma_segs()
> against the container's PID; libbpf is built with
> -fsanitize=address and the only local source change is dropping
> the "static" keyword on parse_vma_segs so the symbol is linkable
> from the harness.
>
> Stock libbpf reports:
>
> ==ERROR: AddressSanitizer: stack-buffer-overflow
> WRITE of size 10349 at <line> thread T0
> #0 scanf_common -> #1 __isoc99_fscanf
> #3 parse_vma_segs tools/lib/bpf/usdt.c:509
> Address ... in frame parse_vma_segs at offset 8512, just past
> line[PATH_MAX].
>
> Patched libbpf parses the same maps cleanly. Follow-up calls
> return 0 with seg_cnt > 0 for libc.so.6 and for
> ld-linux-x86-64.so.2 (format drain), which appears in maps
> after the over-long entry.
>
> On normal hardened builds the stack canary aborts the consumer;
> on builds without stack protector the bytes past line[] are
> attacker-influenced path bytes.
>
> Selftest gate
> =============
>
> tools/testing/selftests/bpf/test_progs -t usdt under QEMU x86_64
> (KVM) on the patched kernel: all 6 subtests pass (usdt/basic,
> basic_optimized, optimized_attach, multispec, urand_auto_attach,
> urand_pid_attach) on both stock and patched libbpf, diff-clean.
> The in-tree selftest does not itself exercise long maps records.
>
> tools/lib/bpf/usdt.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/tools/lib/bpf/usdt.c b/tools/lib/bpf/usdt.c
> index e3710933fd52a..2ed792cf11438 100644
> --- a/tools/lib/bpf/usdt.c
> +++ b/tools/lib/bpf/usdt.c
> @@ -471,7 +471,7 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
> char path[PATH_MAX], line[PATH_MAX], mode[16];
> size_t seg_start, seg_end, seg_off;
> struct elf_seg *seg;
> - int tmp_pid, i, err;
> + int n, i, err;
> FILE *f;
>
> *seg_cnt = 0;
> @@ -480,8 +480,13 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
> * /proc/<pid>/root/<path>. They will be reported as just /<path> in
> * /proc/<pid>/maps.
> */
> - if (sscanf(lib_path, "/proc/%d/root%s", &tmp_pid, path) == 2 && pid == tmp_pid)
> + n = snprintf(line, sizeof(line), "/proc/%d/root", pid);
Minor nit in case this gets respun: Can you mention that the n >= int(sizeof(line)) check also
takes care of making sure the libbpf_strlcpy below doesn't overflow path?
> + if (n < 0 || n >= (int)sizeof(line))
> + return -ENAMETOOLONG;
> + if (str_has_pfx(lib_path, line) && lib_path[n] == '/') {
> + libbpf_strlcpy(path, lib_path + n, sizeof(path));
> goto proceed;
> + }
>
> if (!realpath(lib_path, path)) {
> pr_warn("usdt: failed to get absolute path of '%s' (err %s), using path as is...\n",
> @@ -504,8 +509,11 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
> * 7f5c6f5d1000-7f5c6f5d3000 rw-p 001c7000 08:04 21238613 /usr/lib64/libc-2.17.so
> * 7f5c6f5d3000-7f5c6f5d8000 rw-p 00000000 00:00 0
> * 7f5c6f5d8000-7f5c6f5d9000 r-xp 00000000 103:01 362990598 /data/users/andriin/linux/tools/bpf/usdt/libhello_usdt.so
> + *
> + * Bound the writes and drain residue: maps lines can exceed
> + * PATH_MAX when seq_path() uses a larger seq buffer.
> */
> - while (fscanf(f, "%zx-%zx %s %zx %*s %*d%[^\n]\n",
> + while (fscanf(f, "%zx-%zx %15s %zx %*s %*d%4095[^\n]%*[^\n]\n",
It would be nice if we could somehow not hardcode the lengths to make it clear,
where we're deriving the values from, but unless I'm missing something there
the only way would be to snprintf the format string every single time which would
be overkill.
> &seg_start, &seg_end, mode, &seg_off, line) == 5) {
> void *tmp;
>