[QUESTION/REGRESSION] Unbound kthreads scheduled on nohz_full CPUs after commit 041ee6f3727a
From: sheviks
Date: Sat Mar 21 2026 - 00:44:51 EST
Hi Frederic and maintainers,
I’m reaching out to discuss a change in kthread affinity behavior that
seems to be a regression for users relying on dynamic CPU isolation.
This started appearing after commit 041ee6f3727a ("kthread: Rely on
HK_TYPE_DOMAIN for preferred affinity management").
The Problem:
In my setup, I use nohz_full but intentionally avoid the deprecated
isolcpus= boot parameter. Instead, I use cgroup v2
(cpuset.cpus.partition=isolated) to dynamically isolate CPUs after the
system has booted.
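
A minimal sketch of this kind of runtime isolation is below. It is illustrative only, not the exact commands used on this system. The helper name make_isolated_partition and its cgroup-root argument are hypothetical. The cpuset file names are the ones shown in the log further down. It assumes a kernel where writing cpuset.cpus.exclusive alone is enough to reserve the CPUs for the partition.

```shell
# Hypothetical helper: create an isolated cpuset partition under a cgroup v2 root.
#   $1 = cgroup name, $2 = cpulist, $3 = cgroup root (defaults to /sys/fs/cgroup)
make_isolated_partition() {
    name=$1
    cpus=$2
    root=${3:-/sys/fs/cgroup}

    # Make the cpuset controller available to children of the root cgroup.
    echo "+cpuset" > "$root/cgroup.subtree_control" || return 1

    mkdir -p "$root/$name" || return 1

    # Reserve the CPUs exclusively, then turn the cgroup into an
    # isolated partition root.
    echo "$cpus" > "$root/$name/cpuset.cpus.exclusive" || return 1
    echo "isolated" > "$root/$name/cpuset.cpus.partition" || return 1
}
```

Run as root, `make_isolated_partition isolated1.slice 1-7` should leave 1-7 listed in /sys/fs/cgroup/cpuset.cpus.isolated if the kernel accepts the partition.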
Commit 041ee6f3727a changed kthread preferred affinity to rely on HK_TYPE_DOMAIN.
However, since isolcpus= is not used, HK_TYPE_DOMAIN defaults to all
CPUs at boot time. Even after I later configure cgroups to isolate
CPUs 1-7, unbound kthreads (including kthreadd) remain on those
nohz_full CPUs.
It seems the assumption that "nohz_full implies domain isolation" only
holds true if isolation is statically defined at boot via isolcpus=.
For dynamic isolation via cgroups, HK_TYPE_KTHREAD and HK_TYPE_DOMAIN
no longer cover the same set of CPUs.
System Log:
Here is the state of my system after setting up the cgroup isolation:
$ uname -r
7.0.0-rc4-1-rt
# 1. Boot parameters (No isolcpus)
$ grep -oe "nohz_full=[^ ]*" -e "rcu_nocbs=[^ ]*" /proc/cmdline
nohz_full=1,2,3,4,5,6,7
rcu_nocbs=1,2,3,4,5,6,7
# 2. Cgroup v2 Isolation is active
$ cat /sys/fs/cgroup/isolated1.slice/cpuset.cpus.exclusive
1-7
$ cat /sys/fs/cgroup/isolated1.slice/cpuset.cpus.partition
isolated
$ cat /sys/fs/cgroup/cpuset.cpus.effective
0
$ cat /sys/fs/cgroup/cpuset.cpus.isolated
1-7
# 3. Unbound kthreads are still "trapped" on isolated/nohz_full CPUs
$ ps -eLo cpuid,comm | grep -e COMM -e "^ *[1-7] " | grep -ve "/[1-7]$" -e "kworker/[1-7]:" | head
CPUID COMMAND
4 pool_workqueue_release
1 pr/legacy
4 rcu_exp_gp_kthread_worker
1 kdevtmpfs
5 oom_reaper
1 ksmd
7 watchdogd
7 kswapd0
6 scsi_eh_0
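
The CPUID column above reports, as far as I know, the CPU a thread is currently on or last ran on, not its allowed affinity mask. A complementary check is to read Cpus_allowed_list from /proc/<tid>/status for each kernel thread. The sketch below does that. The function names are hypothetical. It assumes plain cpulists such as "0,2-4" (no stride syntax) and that kernel threads are children of kthreadd (PID 2). Per-CPU kthreads bound to an isolated CPU will also be listed and need to be filtered by name.

```shell
# Expand a kernel cpulist such as "0,2-4" into one CPU number per line.
expand_cpulist() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
        [ -n "$lo" ] || continue
        seq "$lo" "${hi:-$lo}"
    done
}

# Succeed if the two cpulists share at least one CPU.
cpulists_overlap() {
    for cpu in $(expand_cpulist "$1"); do
        expand_cpulist "$2" | grep -qx "$cpu" && return 0
    done
    return 1
}

# Print the allowed-CPU list of one task.
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "/proc/$1/status"
}

# List kernel threads whose allowed mask still includes any CPU in $1.
report_kthreads() {
    isolated=$1
    for tid in $(ps --ppid 2 -o pid=); do
        mask=$(allowed_cpus "$tid" 2>/dev/null) || continue
        [ -n "$mask" ] || continue
        if cpulists_overlap "$mask" "$isolated"; then
            printf '%s\t%s\t%s\n' "$tid" \
                "$(cat "/proc/$tid/comm" 2>/dev/null)" "$mask"
        fi
    done
}
```

For the setup above, `report_kthreads 1-7` would print the TID, name, and allowed mask of each kernel thread that may still run on CPUs 1-7.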
Questions:
1. Is this an intended change that mandates the use of isolcpus= for
kthread exclusion?
2. If we prefer dynamic isolation via cgroup v2, is there a
recommended way to "refresh" or move these unbound kthreads once the
housekeeping mask changes at runtime?
3. Or should HK_TYPE_KTHREAD still be considered separately from
HK_TYPE_DOMAIN to account for nohz_full users without isolcpus=?
I would appreciate any insights or suggestions you might have.
Best regards,
Sheviks