Re: [RFC PATCH] ovl: keep merged and impure readdir caches separate
From: Amir Goldstein
Date: Thu May 14 2026 - 09:09:51 EST
On Thu, May 14, 2026 at 2:14 PM Nirmoy Das <nirmoyd@xxxxxxxxxx> wrote:
>
> Hi Amir,
>
Thanks for persisting on this problem!
> After a lot of debug and traces I found another bug in
> ovl_iterate_merged(): err is set from PTR_ERR(cache) before the
> IS_ERR(cache) check, so on success err holds the truncated cache
> pointer.
>
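For anyone following along, the ordering problem described above can be
sketched in userspace roughly like this. It is a simplified illustration,
not the actual ovl_iterate_merged() code: iterate_buggy() and
iterate_fixed() are hypothetical names, the helpers are stand-ins for the
ones in include/linux/err.h, and a 64-bit long is assumed.

```c
#include <stdbool.h>

#define MAX_ERRNO 4095

_Static_assert(sizeof(long) == 8 && sizeof(void *) == 8,
	       "sketch assumes an LP64 target");

/* Userspace stand-ins for the kernel helpers. */
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	/* Only the top MAX_ERRNO values of the address space count as errors. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/*
 * Buggy shape (hypothetical): err is assigned before the IS_ERR() check,
 * so on success it holds the low 32 bits of a valid pointer.
 */
static int iterate_buggy(void *cache)
{
	int err = PTR_ERR(cache);	/* long -> int conversion drops the high bits */

	if (IS_ERR(cache))
		return err;
	/* ... use cache ... */
	return err;			/* stale value leaks out on the success path */
}

/* Fixed shape (hypothetical): err is only derived from a real error pointer. */
static int iterate_fixed(void *cache)
{
	if (IS_ERR(cache))
		return PTR_ERR(cache);
	/* ... use cache ... */
	return 0;
}
```

With a valid pointer such as 0xFFFF8881CAC4D940, iterate_buggy() returns
a nonzero int while iterate_fixed() returns 0; both still pass a genuine
errno such as -12 through unchanged.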
> Claude generated this:
> getdents64
> └── iterate_dir(outer_overlay_file)
> └── ovl_iterate_merged [OUTER]
> │
> ├── cache = ovl_cache_get(dentry): [OUTER ovl_cache_get]
> │ │ cache_local = kzalloc(...)
> │ │ res = ovl_dir_read_merged(...)
> │ │ │
> │ │ └── iterate_dir(inner_overlay_file)
> │ │ └── ovl_iterate_merged [INNER]
> │ │ ├── cache_inner = ovl_cache_get(...) valid 0xFFFF8881_CAC4D940
> │ │ ├── err = PTR_ERR(cache_inner) = -893068992 (low32 of ptr)
> │ │ ├── IS_ERR(cache_inner) → FALSE
> │ │ └── return err; ← leaked stale int
> │ │
> │ └── return ERR_PTR(res); res = -893068992 (int)
> │ sign-extended →
> │ (void *)0xFFFFFFFF_CAC4D940
Wow! This is demonic.
I think we need to fortify ERR_PTR() to not accept illegal errno values.
I'll try to write a patch.
> │
> ├── err = PTR_ERR(cache); err = -893068992
> ├── if (IS_ERR(cache)) ← FALSE: IS_ERR only trips on top 4K
> │ (errno window);
> │ 0xFFFFFFFF_CAC4D940 sits below
> │ return err; ← skipped
> ├── od->cache = cache; ★ corrupted pointer stored
> └── ovl_seek_cursor(od, ctx->pos)
> list_for_each(p, &od->cache->entries)
> p = *(&od->cache->entries) on bad pointer ★ PAGE FAULT
>
Oh boy!
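To convince myself of the numbers in the trace, here is a minimal
userspace sketch of the ERR_PTR(PTR_ERR()) round trip. The helpers are
stand-ins modeled on include/linux/err.h, truncated_err() and
round_trip() are hypothetical names, and it assumes an LP64 target where
the long to int conversion keeps the low 32 bits (as GCC and Clang do).

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

_Static_assert(sizeof(long) == 8 && sizeof(void *) == 8,
	       "sketch assumes an LP64 target");

/* Userspace stand-ins for the kernel helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Mimics "int err = PTR_ERR(cache)": only the low 32 bits survive. */
static int truncated_err(const void *ptr)
{
	return (int)PTR_ERR(ptr);
}

/*
 * Mimics "return ERR_PTR(res)" with an int res: the int is sign-extended
 * to long, producing a pointer that is neither the original nor an errno.
 */
static void *round_trip(const void *ptr)
{
	return ERR_PTR(truncated_err(ptr));
}
```

Feeding 0xFFFF8881CAC4D940 through round_trip() gives
0xFFFFFFFFCAC4D940, which IS_ERR() does not flag because it sits below
the top 4095 values, so the caller stores and dereferences it.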
> Sent it as a separate patch: "[PATCH] ovl: keep err zero after
> successful ovl_cache_get()".
>
> Let me know what you think.
Patch looks great - real root cause, gave you minor comments.
IIUC, as far as you know, the repro never triggered the theoretical
OVL_WHITEOUTS race?
It was always just this ERR_PTR(PTR_ERR()) black magic?
Thanks,
Amir.