Re: [PATCH 3/3] ocfs2: reject regular files with non-zero i_size and zero i_clusters
From: Joseph Qi
Date: Sun May 17 2026 - 21:38:59 EST
On 5/17/26 7:10 PM, Michael Bommarito wrote:
> On a volume mounted WITHOUT OCFS2_FEATURE_INCOMPAT_SPARSE_ALLOC, a
> regular file with non-zero i_size, zero i_clusters, and no
> OCFS2_INLINE_DATA_FL flag is structurally malformed: the extent
> map declares no allocated clusters yet the size header claims
> the file has content. ocfs2_populate_inode() copies i_size into
> the in-core inode and dispatches to ocfs2_aops; subsequent reads
> or truncates then operate on an inconsistent extent state.
>
> This is the shape an attacker who keeps the rest of the extent
> list intact (to satisfy the inline-data, refcount, chain-list,
> and per-field validators already in this function) would produce
> when forging only the inode header to publish a synthetic file
> size on a victim node. It is also the shape on-disk corruption
> of the i_clusters field produces. Reject early in the
> validator.
>
> The check is restricted to non-sparse volumes
> (ocfs2_sparse_alloc() returns false). On non-sparse mounts the
> allocator path always grows clusters before i_size:
> ocfs2_extend_file() takes the !sparse branch into
> ocfs2_extend_no_holes(), which calls ocfs2_extend_allocation()
> to journal new clusters first, and only then
> ocfs2_simple_size_update() journals the larger i_size. The
> truncate path likewise lowers i_size in ocfs2_orphan_for_truncate()
> and then frees clusters in ocfs2_commit_truncate(), which uses
> ocfs2_clusters_for_bytes(new_i_size) as its new_highest_cpos:
> when new_i_size > 0 the floor is at least one cluster, so the
> on-disk dinode never legitimately exposes a non-inline regular
> file with i_size > 0 and i_clusters == 0 on a non-sparse volume.
>
> On sparse-alloc volumes the same shape is legitimate: an
> ocfs2_extend_file() call goes through ocfs2_zero_extend() +
> ocfs2_simple_size_update(), which grows i_size on its own
> without changing i_clusters; a sparse regular file freshly
> extended with truncate -s 1M is thus stored on disk as
> (i_size = 1048576, i_clusters = 0). The check therefore opts
> out via ocfs2_sparse_alloc(OCFS2_SB(sb)).
>
> System inodes (OCFS2_SYSTEM_FL) carry their own size and
> cluster invariants validated by the allocator, journal, quota,
> and truncate-log subsystems; skip them here. The inline-data
> fast path is filtered separately by its own dedicated branch
> below: its well-formed case is exactly i_clusters == 0 with
> i_size <= id_count. Symlinks legitimately keep i_clusters ==
> 0 with non-zero i_size (fast symlinks), so this check is
> restricted to S_IFREG.
>
> Fixes: b657c95c1108 ("ocfs2: Wrap inode block reads in a dedicated function.")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Michael Bommarito <michael.bommarito@xxxxxxxxx>
> Assisted-by: Claude:claude-opus-4-7
Looks fine.
Reviewed-by: Joseph Qi <joseph.qi@xxxxxxxxxxxxxxxxx>
> ---
> fs/ocfs2/inode.c | 41 +++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 41 insertions(+)
>
> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
> index 305e22cc9b1d9..c63d2ced6b338 100644
> --- a/fs/ocfs2/inode.c
> +++ b/fs/ocfs2/inode.c
> @@ -1571,6 +1571,47 @@ int ocfs2_validate_inode_block(struct super_block *sb,
> goto bail;
> }
>
> +	/*
> +	 * On a non-sparse volume, a regular file with non-zero i_size
> +	 * and zero i_clusters that is not marked as inline data is
> +	 * structurally malformed: the extent map declares no allocated
> +	 * clusters yet the size header claims the file has content.
> +	 * ocfs2_populate_inode() would still publish i_size to VFS and
> +	 * leave the extent state inconsistent for any later read or
> +	 * truncate. This is the shape an attacker who keeps the rest
> +	 * of the extent list intact (to satisfy the inline-data,
> +	 * refcount, chain-list, and per-field validators above) would
> +	 * produce when forging only the inode header to publish a
> +	 * synthetic file size on a victim node. It is also the shape
> +	 * on-disk corruption of the i_clusters field produces.
> +	 *
> +	 * The check opts out on sparse-alloc volumes, where the
> +	 * extend path (ocfs2_extend_file -> ocfs2_zero_extend ->
> +	 * ocfs2_simple_size_update) legitimately grows i_size without
> +	 * allocating clusters. On non-sparse volumes the equivalent
> +	 * path (ocfs2_extend_no_holes) journals clusters first and
> +	 * i_size second, and truncate-down floors i_clusters at
> +	 * ocfs2_clusters_for_bytes(new_i_size) which is >= 1 whenever
> +	 * new_i_size > 0, so the rejected shape never appears on disk.
> +	 *
> +	 * Skip system inodes (OCFS2_SYSTEM_FL) and the inline-data
> +	 * fast path (handled below). Symlinks legitimately keep
> +	 * i_clusters == 0 with non-zero i_size (fast symlinks), so
> +	 * restrict to S_IFREG.
> +	 */
> +	if (!ocfs2_sparse_alloc(OCFS2_SB(sb)) &&
> +	    S_ISREG(le16_to_cpu(di->i_mode)) &&
> +	    !(le32_to_cpu(di->i_flags) & OCFS2_SYSTEM_FL) &&
> +	    !(le16_to_cpu(di->i_dyn_features) & OCFS2_INLINE_DATA_FL) &&
> +	    le64_to_cpu(di->i_size) != 0 &&
> +	    le32_to_cpu(di->i_clusters) == 0) {
> +		rc = ocfs2_error(sb,
> +				 "Invalid dinode #%llu: regular file i_size %llu with i_clusters 0 and no inline-data flag on non-sparse volume\n",
> +				 (unsigned long long)bh->b_blocknr,
> +				 (unsigned long long)le64_to_cpu(di->i_size));
> +		goto bail;
> +	}
> +
> +
> if (le16_to_cpu(di->i_dyn_features) & OCFS2_INLINE_DATA_FL) {
> struct ocfs2_inline_data *data = &di->id2.i_data;
>