Re: [PATCH v2 0/6] alloc_tag: introduce IOCTL-based filtering for MAP
From: Suren Baghdasaryan
Date: Wed Jun 03 2026 - 16:02:12 EST
On Fri, May 22, 2026 at 1:11 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, 22 May 2026 17:45:32 +0000 Abhishek Bapat <abhishekbapat@xxxxxxxxxx> wrote:
>
> > Currently, memory allocation profiling data is primarily exposed through
> > /proc/allocinfo. While useful for manual inspection, this text-based
> > interface poses challenges for production monitoring and large-scale
> > analysis:
> >
> > 1. Userspace must parse large amounts of text to extract specific
> > fields.
> > 2. To find specific tags, userspace must read the entire dataset,
> > requiring many context switches and high data copying.
> > 3. The kernel currently aggregates per-CPU counters for every allocation
> > size, even those the user intends to filter out immediately.
> >
> > This series introduces a new IOCTL-based binary interface for allocinfo
> > that supports kernel-side filtering. By allowing the user to specify a
> > filter mask, we significantly reduce the work performed in-kernel and
> > the amount of data transferred to userspace.
> >
> > Performance measurements were conducted on an Intel Xeon Platinum 8481C
> > (224 CPUs) with caches dropped before each run.
> >
> > The IOCTL mechanism shows a ~20x performance improvement for
> > filtered queries. The kernel avoids the expensive per-CPU counter
> > aggregation (alloc_tag_read) for any tags that fail the initial string
> > or location filters.
> >
> > Scenario 1: Specific File Filtering (arch/x86/events/rapl.c)
> > 1. Traditional (cat /proc/allocinfo | grep): 22ms (sys)
> > 2. IOCTL Interface: 1ms (sys)
> >
> > Scenario 2: Compound Filtering (Filename + Size)
> > 1. Traditional: (cat ... | grep | awk): 21ms (sys)
> > 2. IOCTL Interface: 1ms (sys)
> >
> > Scenario 3: Size-Based Filtering (min_size = 1MB)
> > 1. Traditional: (cat ... | awk): 21ms (sys)
> > 2. IOCTL Interface: 14ms (sys)
>
> Yup, textual interfaces aren't fast.
>
> And ioctl-based interfaces aren't popular. One would prefer to see an
> interface which uses read()/lseek(), pread(), etc. It would be
> appropriate for this [0/N] to have a discussion of why that approach
> was not chosen.
We chose ioctl because it lets the kernel apply the filter before
aggregating the per-CPU counters, and that aggregation is the main
overhead when reading this file. That is how we get the ~20x
improvement, provided we do not filter on allocation size: a size
filter needs the aggregated value, so every tag still has to be
summed (Scenario 3 above, 21ms -> 14ms).
Aside from that, I plan on introducing an additional ioctl command to
enable context capture for specific allocations.
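To illustrate the ordering, here is a simplified userspace sketch. It
is not the code from the series: struct tag_sketch, struct
filter_sketch, aggregate() and query() are hypothetical names, and
NR_CPUS_SKETCH is an arbitrary small value. The only point is that the
cheap filename check runs before the per-CPU sum, while a size check
can only run after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS_SKETCH 4	/* hypothetical; the test machine had 224 CPUs */

/* Hypothetical stand-in for one allocation tag. */
struct tag_sketch {
	const char *filename;			/* source location of the tag */
	uint64_t bytes[NR_CPUS_SKETCH];		/* per-CPU byte counters */
};

/* Hypothetical filter, loosely modelled on the idea of a filter mask. */
struct filter_sketch {
	const char *filename;	/* NULL matches any file */
	uint64_t min_size;	/* 0 disables the size filter */
};

/* Counts how many times the expensive per-CPU sum was performed. */
static unsigned long aggregations;

/* Sum the per-CPU counters of one tag (the costly step). */
static uint64_t aggregate(const struct tag_sketch *t)
{
	uint64_t sum = 0;

	aggregations++;
	for (size_t cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += t->bytes[cpu];
	return sum;
}

/*
 * Store the aggregated size of every matching tag in out[] and return
 * the number of matches. Tags rejected by the filename check are never
 * aggregated; the size check needs the sum, so it cannot be skipped.
 */
static size_t query(const struct tag_sketch *tags, size_t ntags,
		    const struct filter_sketch *f,
		    uint64_t *out, size_t out_cap)
{
	size_t n = 0;

	assert(tags && f && out);
	for (size_t i = 0; i < ntags && n < out_cap; i++) {
		if (f->filename && strcmp(tags[i].filename, f->filename) != 0)
			continue;	/* rejected without reading counters */

		uint64_t total = aggregate(&tags[i]);

		if (total < f->min_size)
			continue;	/* rejected only after the full sum */
		out[n++] = total;
	}
	return n;
}
```

With a filename-only filter, aggregate() runs once per matching tag.
With a size-only filter it runs for every tag, which is consistent
with the smaller gain in Scenario 3.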
>
> > .../userspace-api/ioctl/ioctl-number.rst | 2 +
> > MAINTAINERS | 2 +
> > include/linux/codetag.h | 1 +
> > include/uapi/linux/alloc_tag.h | 87 +++
> > lib/alloc_tag.c | 303 ++++++++++-
> > lib/codetag.c | 11 +
> > tools/testing/selftests/alloc_tag/Makefile | 9 +
> > .../alloc_tag/allocinfo_ioctl_test.c | 505 ++++++++++++++++++
> > 8 files changed, 918 insertions(+), 2 deletions(-)
> > create mode 100644 include/uapi/linux/alloc_tag.h
> > create mode 100644 tools/testing/selftests/alloc_tag/Makefile
> > create mode 100644 tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
>
> At some point this should grow user-facing documentation, please.
>
> And the right time for that is now, because such documentation is
> useful for code review - it makes that review both easier and more
> useful.
Ack. I believe Abhishek is working on that.
>
> Sashiko had a few things to say:
>
> https://sashiko.dev/#/patchset/cover.1779471082.git.abhishekbapat@xxxxxxxxxx
Ack.