[BUG] ext4: delayed-free buddy load error reaches BUG_ON in ext4_process_freed_data
From: Yifei Chu
Date: Sun May 24 2026 - 11:15:03 EST
Hello,
Short version: I am reporting an ext4 delayed-free error-path bug found with targeted fault injection. The injected value, -EIO, is one that ext4_mb_load_buddy() can already return on its normal error paths, and the injection is applied only to the helper's return value at the delayed-free callsite. With that rare lower-layer failure made deterministic, ext4_free_data_in_buddy() reaches BUG_ON(err != 0) and crashes the kernel. The oops reports the RIP in ext4_process_freed_data(), its caller.
Tested kernel:
v7.1-rc4-640-g79bd2dded182
79bd2dded182b1d458b18e62684b7f82ffc682e5
x86_64 QEMU, KASAN config
The relevant code shape in fs/ext4/mballoc.c is:
	err = ext4_mb_load_buddy(sb, entry->efd_group, &e4b);
	/* we expect to find existing buddy because it's pinned */
	BUG_ON(err != 0);
The point of the injection is not to corrupt ext4 state. It only makes a plausible buddy/bitmap load failure deterministic at this caller, so the caller’s error handling can be tested. ext4_mb_load_buddy() already has normal negative error returns from metadata loading paths.
Reproducer shape:
- Mount a fresh ext4 filesystem.
- Create and fsync a 256 KiB file.
- Unlink the file.
- Call sync() to force delayed-free processing.
- The instrumentation forces ext4_mb_load_buddy() to return -EIO at the delayed-free callsite.
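The userspace side of these steps can be sketched roughly as follows. This is a minimal illustration, not the repro_init.c from the tarball: the function name run_delayed_free_repro() and the file name delayed_free_victim are made up for this sketch, and it assumes the directory passed in is on an already mounted ext4 filesystem. On an uninstrumented kernel it simply returns 0; the crash only happens when the instrumentation forces the -EIO.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define REPRO_FILE_SIZE (256 * 1024)	/* 256 KiB, as in the steps above */

/*
 * Hypothetical helper: create and fsync a 256 KiB file in 'dir',
 * unlink it, then call sync() so delayed-free processing runs.
 * Returns 0 on success, -1 on any failure.
 */
int run_delayed_free_repro(const char *dir)
{
	static char buf[REPRO_FILE_SIZE];
	char path[4096];
	size_t done = 0;
	int fd, n;

	n = snprintf(path, sizeof(path), "%s/delayed_free_victim", dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	memset(buf, 0xa5, sizeof(buf));

	fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
	if (fd < 0)
		return -1;

	/* write() may be short, so loop until the whole buffer is written */
	while (done < sizeof(buf)) {
		ssize_t w = write(fd, buf + done, sizeof(buf) - done);

		if (w < 0) {
			if (errno == EINTR)
				continue;
			close(fd);
			unlink(path);
			return -1;
		}
		done += (size_t)w;
	}

	/* make sure the blocks are really allocated on disk before freeing */
	if (fsync(fd) != 0) {
		close(fd);
		unlink(path);
		return -1;
	}
	if (close(fd) != 0) {
		unlink(path);
		return -1;
	}

	/* freeing the blocks queues them for delayed-free processing */
	if (unlink(path) != 0)
		return -1;

	/* force a commit so the freed extents are processed */
	sync();
	return 0;
}
```

Calling run_delayed_free_repro("/mnt") from a small main() or init program, with /mnt being the fresh ext4 mount, is enough to drive the sequence.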
Two runs, each on a fresh image, reproduced the same crash:
AGENT_INIT: unlink ret=0 errno=0 (Success)
AGENT_INIT: calling sync to force delayed free processing
EXT4-fs: AGENT_EXT4_FREE_DATA_BUDDY_BUGON: forcing ext4_mb_load_buddy EIO before BUG_ON
kernel BUG at fs/ext4/mballoc.c:3990!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
RIP: 0010:ext4_process_freed_data+0x1fe/0x510
I did a local duplicate sweep and found related older ext4_mb_load_buddy()/mballoc fixes, but I did not find a direct current-upstream fix for this delayed-free BUG_ON(err != 0) path.
Expected behavior:
A metadata load failure during delayed-free processing should go through ext4 error handling / transaction abort / filesystem error propagation, rather than treating the error as an impossible invariant and BUGing the kernel.
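To make the expected control flow concrete, here is a small userspace model of it. This is only a sketch and not a proposed patch: every name in it (struct model_fs, model_load_buddy(), model_report_error(), model_free_data_in_buddy()) is hypothetical and does not exist in fs/ext4, and the "aborted" flag merely stands in for whatever ext4 error handling / abort mechanism the maintainers consider appropriate.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for filesystem state; not an ext4 structure. */
struct model_fs {
	bool inject_load_error;		/* models the forced -EIO */
	bool aborted;			/* models "filesystem marked in error" */
	int last_error;			/* last error that was reported */
	unsigned long freed_clusters;	/* clusters actually released */
};

/* Hypothetical stand-in for one delayed-free entry. */
struct model_free_entry {
	unsigned int group;
	unsigned long count;
};

/* Models a buddy load that can fail with a normal negative errno. */
static int model_load_buddy(struct model_fs *fs, unsigned int group)
{
	(void)group;
	return fs->inject_load_error ? -EIO : 0;
}

/* Models error reporting: record the error and mark the fs aborted. */
static void model_report_error(struct model_fs *fs, int err)
{
	fs->aborted = true;
	fs->last_error = err;
	fprintf(stderr, "model: buddy load failed, err=%d\n", err);
}

/*
 * Models the delayed-free caller. A load failure is reported and
 * propagated to the caller instead of being treated as an
 * impossible invariant.
 */
int model_free_data_in_buddy(struct model_fs *fs,
			     const struct model_free_entry *entry)
{
	int err = model_load_buddy(fs, entry->group);

	if (err) {
		model_report_error(fs, err);
		return err;	/* nothing is freed on the error path */
	}

	fs->freed_clusters += entry->count;
	return 0;
}
```

In this model the injected failure leaves freed_clusters untouched and surfaces -EIO to the caller, which is the shape of behavior described above, as opposed to a BUG_ON().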
The attached tarball includes README.md, repro_init.c, instrumentation.patch, and both full serial logs.
Thanks,
Chuyifei
Attachment:
ext4_free_data_buddy_load_error_bugon_20260524.tar.gz
Description: Unix tar archive