Re: [RFC PATCH] ovl: keep merged and impure readdir caches separate

From: Amir Goldstein

Date: Thu May 14 2026 - 09:09:51 EST


On Thu, May 14, 2026 at 2:14 PM Nirmoy Das <nirmoyd@xxxxxxxxxx> wrote:
>
> Hi Amir,
>
> After a lot of debug and traces I found another bug in

Thanks for persisting on this problem!

> ovl_iterate_merged(): err is set from PTR_ERR(cache) before the
> IS_ERR(cache) check, so on success err holds the truncated cache
> pointer.
>
> Claude generated this:
> getdents64
> └── iterate_dir(outer_overlay_file)
>     └── ovl_iterate_merged [OUTER]
>
>         ├── cache = ovl_cache_get(dentry):  [OUTER ovl_cache_get]
>         │   │   cache_local = kzalloc(...)
>         │   │   res = ovl_dir_read_merged(...)
>         │   │   │
>         │   │   └── iterate_dir(inner_overlay_file)
>         │   │       └── ovl_iterate_merged [INNER]
>         │   │           ├── cache_inner = ovl_cache_get(...)  valid 0xFFFF8881_CAC4D940
>         │   │           ├── err = PTR_ERR(cache_inner) = -893068992  (low32 of ptr)
>         │   │           ├── IS_ERR(cache_inner) → FALSE
>         │   │           └── return err;  ← leaked stale int
>         │   │
>         │   └── return ERR_PTR(res);  res = -893068992 (int)
>         │                             sign-extended →
>         │                             (void *)0xFFFFFFFF_CAC4D940

Wow! This is demonic.
I think we need to fortify ERR_PTR() to not accept illegal errno values.
I'll try to write a patch.
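
For illustration, a minimal userspace sketch of the round trip in the trace above. It assumes an LP64 target (64-bit long and pointers) and a compiler that truncates modulo 2^32 on long-to-int conversion, as gcc and clang do. ERR_PTR(), PTR_ERR() and IS_ERR() are simplified stand-ins for the helpers in include/linux/err.h; truncate_ptr() and is_legal_errno() are hypothetical names used only here, not an existing kernel API and not the patch mentioned above.

```c
#include <stdint.h>

#define MAX_ERRNO 4095UL

/* Simplified userspace stand-ins for the kernel helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }

/* True only for the top MAX_ERRNO addresses (the errno window). */
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/*
 * Models "int err = PTR_ERR(cache)" on a valid pointer:
 * only the low 32 bits of the address survive in the int.
 */
static inline int truncate_ptr(const void *ptr)
{
	return (int)PTR_ERR(ptr);
}

/*
 * Hypothetical predicate a fortified ERR_PTR() could check:
 * accept only values in [-MAX_ERRNO, -1].
 */
static inline int is_legal_errno(long error)
{
	return error < 0 && error >= -(long)MAX_ERRNO;
}
```

With the pointer from the trace, truncate_ptr((void *)0xFFFF8881CAC4D940UL) gives -893068992, ERR_PTR() of that value sign-extends to 0xFFFFFFFFCAC4D940, and IS_ERR() on the result is false because the address sits below the errno window. is_legal_errno() rejects -893068992 and accepts, for example, -12 (-ENOMEM).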

>
>         ├── err = PTR_ERR(cache);  err = -893068992
>         ├── if (IS_ERR(cache))     ← FALSE: IS_ERR only trips on top 4K
>         │                            (errno window);
>         │                            0xFFFFFFFF_CAC4D940 sits below
>         │       return err;        ← skipped
>         ├── od->cache = cache;     ★ corrupted pointer stored
>         └── ovl_seek_cursor(od, ctx->pos)
>             list_for_each(p, &od->cache->entries)
>             p = *(&od->cache->entries) on bad pointer  ★ PAGE FAULT
>

Oh boy!

> Sent it as a separate patch: "[PATCH] ovl: keep err zero after
> successful ovl_cache_get()".
>
> Let me know what you think.

Patch looks great - real root cause. I sent you a few minor comments on it.
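
To spell out the pattern for the archive, here is a rough userspace sketch of the two orderings, under the same LP64 and truncating-conversion assumptions as the trace. struct demo_cache, iterate_buggy() and iterate_fixed() are hypothetical names for illustration; this is not the actual ovl_iterate_merged() code nor the text of the patch.

```c
#define MAX_ERRNO 4095UL

/* Simplified userspace stand-ins for the helpers in include/linux/err.h. */
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical cache type; never dereferenced in this sketch. */
struct demo_cache { int entries; };

/*
 * Buggy ordering: err is assigned before the IS_ERR() check, so on
 * success it still holds the low 32 bits of a valid pointer.
 */
static int iterate_buggy(struct demo_cache *cache)
{
	int err = PTR_ERR(cache);

	if (IS_ERR(cache))
		return err;
	/* ... success path that never resets err ... */
	return err;
}

/* Fixed ordering: err stays zero unless the pointer encodes an errno. */
static int iterate_fixed(struct demo_cache *cache)
{
	int err = 0;

	if (IS_ERR(cache))
		return PTR_ERR(cache);
	/* ... success path ... */
	return err;
}
```

Given the valid pointer value from the trace, iterate_buggy() returns -893068992 while iterate_fixed() returns 0; both return -12 for a pointer that encodes -ENOMEM.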

IIUC, as far as you know, the repro never triggered the theoretical
OVL_WHITEOUTS race?

It was always just this ERR_PTR(PTR_ERR()) black magic?

Thanks,
Amir.