Re: [PATCH v2 0/3] ksmbd: fix connection and durable handle teardown races
From: Namjae Jeon
Date: Tue Apr 28 2026 - 23:22:05 EST
On Tue, Apr 28, 2026 at 11:09 PM DaeMyung Kang <charsyam@xxxxxxxxx> wrote:
>
> This series fixes lifetime bugs around ksmbd connection shutdown,
> session file-table teardown, and durable handle scavenging.
>
> Patch 1 centralizes the final struct ksmbd_conn release so every last
> putter runs ida_destroy() and transport cleanup. The release is queued
> to a dedicated workqueue because transport teardown can sleep, while
> one known last-putter is an RCU callback.
>
> Patch 2 hardens __close_file_table_ids() by taking a transient
> ksmbd_file reference, unpublishing from the session idr under ft->lock,
> and doing sleepable preserve/close work outside ft->lock. It also
> makes the FP_NEW window visible to the opener through
> ksmbd_update_fstate().
>
> Patch 2 is scoped to file-table teardown and the FP_NEW publication
> window. It does not try to fix durable reconnect rollback on later
> smb2_open() error paths (or the related post-FP_INITED reference
> window in a fresh smb2_open()). Both predate this series
> and need either an explicit unpublish-on-error step or an extra
> session-owned reference. That is left as follow-up work. The
> FP_NEW -> FP_INITED failure reuses smb2_open()'s existing -ENOENT to
> STATUS_OBJECT_NAME_INVALID mapping to avoid changing wire behavior in
> this lifetime fix.
>
> Patch 3 closes two related races in the durable scavenger against any
> walker that iterates f_ci->m_fp_list (ksmbd_lookup_fd_inode() and the
> share-mode checks). The scavenger no longer reuses fp->node as a
> scavenger-private collect-list node, takes an explicit transient
> reference under global_ft.lock, and drops both the durable lifetime
> and transient refs with atomic_sub_and_test(2, ...) after the
> m_fp_list unlink so an in-flight m_fp_list walker that snatched fp
> owns the final close cleanly. fp->persistent_id is cleared inside
> __ksmbd_remove_durable_fd() so a delayed final close cannot re-issue
> idr_remove() on a slot that idr_alloc_cyclic() may have already
> re-handed to a new durable handle. __put_fd_final() bypasses the
> per-conn open_files_count decrement when fp is detached from any
> session table (fp->conn cleared by session_fd_check() at durable
> preserve, paired with the volatile_id clear at unpublish), since the
> walker that owns the final close in that case runs from an unrelated
> work->conn whose counter never tracked this durable fp.
>
> The series is intentionally scoped to lifetime/race fixes and the
> walker-final-putter regression that the durable scavenger handoff in
> patch 3 newly exposes. Pre-existing reconnect rollback and per-conn
> open_files_count accounting gaps are left as follow-up work so this
> series does not have to claim a full durable-reconnect or accounting
> cleanup.
>
> Validation:
>   * abrupt-disconnect kmemleak/kprobe A/B for connection release
>   * same-session two-tcon DEBUG_LIST/DEBUG_OBJECTS_WORK stress
>   * forced durable-preserve sleep-path harness for session teardown
>   * KASAN-enabled direct SMB2 coverage for session/tree teardown and
>     durable-preserve paths
>   * KASAN-enabled direct SMB2 coverage for durable scavenger expiry
>     racing with m_fp_list lookups
>   * checkpatch --strict for all patches
>   * make -j$(nproc) M=fs/smb/server
>
> v1 -> v2:
>   * Split the original change into bisectable patches: connection final
>     release, session file-table teardown, and durable scavenger races.
>   * Keep sleepable session preserve/close work out of ft->lock and make
>     the FP_NEW publication race visible through a cleared volatile id.
>   * Document that durable reconnect rollback on later smb2_open()
>     error paths is a pre-existing follow-up item, and keep the
>     existing -ENOENT to STATUS_OBJECT_NAME_INVALID wire mapping for
>     this lifetime fix.
>   * Avoid reusing fp->node as a temporary durable scavenger list node
>     and take a transient reference in the durable scavenger so
>     concurrent ksmbd_lookup_fd_inode() walkers cannot UAF on freed fp.
>     These two fixes are folded into a single patch because the
>     list-head-reuse fix alone leaves a deterministic UAF window for
>     m_fp_list walkers; bisecting onto an intermediate state would land
>     on a use-after-free that pre-patch chaos merely made less
>     reproducible.
>   * Clear fp->persistent_id in __ksmbd_remove_durable_fd() so a holder
>     that owns the final close after a scavenger removal does not
>     re-issue idr_remove() on a persistent id that may have already been
>     handed out to a new durable handle.
>   * Bypass the per-conn open_files_count decrement in __put_fd_final()
>     when fp is detached from any session table, so an m_fp_list walker
>     that owns the final close of a scavenged durable fp does not
>     underflow an unrelated conn's stats counter.
>   * Document the ksmbd_conn_wq lifetime invariant in ksmbd_conn_put()
>     instead of guarding with WARN_ON_ONCE, so a violation surfaces as
>     a NULL deref rather than a silent leak of the final release.
>
> DaeMyung Kang (3):
>   ksmbd: centralize ksmbd_conn final release to plug transport leak
>   ksmbd: harden file lifetime during session teardown
>   ksmbd: close durable scavenger races against m_fp_list lookups
Applied them to #ksmbd-for-next-next.
Thanks!