Re: [PATCH] bpf: Simplify tnum_step()

From: Hao Sun

Date: Thu Mar 19 2026 - 15:41:36 EST

On Thu, Mar 19, 2026 at 6:38 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Thu, 2026-03-19 at 10:06 +0100, Hao Sun wrote:
> > On Thu, Mar 19, 2026 at 9:18 AM Kumar Kartikeya Dwivedi
> > <memxor@xxxxxxxxx> wrote:
> > >
> > > On Wed, 18 Mar 2026 at 19:21, Hao Sun <sunhao.th@xxxxxxxxx> wrote:
> > > > [...]
> > > > in case anyone is interested:
> > > > [1] https://pastebin.com/raw/czHKiyY0
> > >
> > > IMO it is worth it to include this proof inline in the commit log,
> > > since links are fragile.
> > > It's not that big, and I think it's more useful to have it inline than not.
> > >
> >
> > The only concern is that the proof mainly uses `bv_decide`, which does not
> > provide much insight. But it's not big, I will inline it.
>
> I agree with you, not sure it would provide much signal, tbh.
> As far as I understand `bv_decide` means: SAT-solver, please do the magic :)
>
> A more interesting discussion would be to have some model-checker
> based tests in the selftests, but Alexei was not excited last time we
> talked about that.
>
> If we have enough interested people, we can pick a checker and
> maintain a "shadow copy" of relevant functions and data structures in
> some repo e.g. on github + current proofs/tests. To have a starting
> point for future changes.

I actually have an early prototype for this (developed a few months ago):
https://github.com/SunHao-0/eBPF_verification

The tool extracts the **unmodified** functions relevant to range analysis from
`verifier.c` and `tnum.c` into a self-contained `range_analysis.c`
file (no kernel
dependencies). The extraction is done with a simple approach: maintaining
a list of functions of interest, then extracting by lines. The
extracted code can
then be verified independently.

The specification is defined in `verifier_spec.c` linked with
`range_analysis.c`,
cmbc and bitwuzla are used for bounded model checking. The workflow is
simple enough for the original goal.

But the solver later became a bottleneck as the target function grows larger.
This required more modular verification, breaking down the verification
granularity. Nevertheless, it's a practical setup.

I am recently more interested in a Lean4-based solution: we can embed a
very small subset of C (large enough for the verifier's algorithm) as DSL
into Lean4, so that we can still use the same code. Then we can define
the spec for the function (similar to the theorem provided in the link), and
prove the soundness (even other interesting properties) using Lean4
tractic mode, which is more powerful (and becomes necessary once
the property is nontrivial). AI can certainly help with writing the proof:).
Future changes can then go through the proof; one interesting point
is that we only need to review the spec, Lean4 checks the proof, and
the trust here is Lean4's kernel, relatively small and robust.

If either of the above is of interest, we can certainly develop and maintain.