Re: [PATCH] ext4: skip split extent recovery on corruption
From: Zhang Yi
Date: Mon Mar 23 2026 - 08:19:28 EST
On 3/23/2026 9:57 AM, hongao wrote:
> ext4_split_extent_at() retries after ext4_ext_insert_extent() fails by
> refinding the original extent and restoring its length. That recovery is
> only safe for transient resource failures such as -ENOSPC, -EDQUOT, and
> -ENOMEM.
>
> When ext4_ext_insert_extent() fails because the extent tree is already
> corrupted, ext4_find_extent() can return a leaf path without p_ext.
> ext4_split_extent_at() then dereferences path[depth].p_ext while trying to
> fix up the original extent length, causing a NULL pointer dereference while
> handling a pre-existing filesystem corruption.
>
> Do not enter the recovery path for corruption errors, and validate p_ext
> after refinding the extent before touching it. This keeps the recovery path
> limited to cases it can actually repair and turns the syzbot-triggered crash
> into a proper corruption report.
>
> Fixes: 716b9c23b862 ("ext4: refactor split and convert extents")
> Reported-by: syzbot+1ffa5d865557e51cb604@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://syzkaller.appspot.com/bug?extid=1ffa5d865557e51cb604
> Signed-off-by: hongao <hongao@xxxxxxxxxxxxx>
Thank you for the patch, this is a nice catch. This overall looks good
to me besides one minor suggestion below.
Reviewed-by: Zhang Yi <yi.zhang@xxxxxxxxxx>
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index ae3804f36535..aba9947fd151 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3239,6 +3239,9 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
>
> insert_err = PTR_ERR(path);
> err = 0;
> + if (insert_err != -ENOSPC && insert_err != -EDQUOT &&
> + insert_err != -ENOMEM)
> + goto out_path;
>
> /*
> * Get a new path to try to zeroout or fix the extent length.
> @@ -3261,6 +3264,13 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
>
> depth = ext_depth(inode);
> ex = path[depth].p_ext;
> + if (!ex) {
> + EXT4_ERROR_INODE(inode,
> + "bad extent address lblock: %lu, depth: %d pblock %lld",
> + (unsigned long)ee_block, depth, path[depth].p_block);
> + err = -EFSCORRUPTED;
> + goto out;
> + }
Should we move the entire code snippet before the previous
ext4_ext_get_access() call? Although this is not closely related to the
current patch.
>
> fix_extent_len:
> ex->ee_len = orig_ex.ee_len;