Re: [QUESTION/REGRESSION] Unbound kthreads scheduled on nohz_full CPUs after commit 041ee6f3727a
From: Waiman Long
Date: Sun Mar 22 2026 - 12:11:39 EST
On 3/21/26 12:44 AM, sheviks wrote:
Hi Frederic and maintainers,
I’m reaching out to discuss a change in kthread affinity behavior that
seems to be a regression for users relying on dynamic CPU isolation.
This started appearing after commit 041ee6f3727a ("kthread: Rely on
HK_TYPE_DOMAIN for preferred affinity management").
The Problem:
In my setup, I use nohz_full but intentionally avoid the deprecated
isolcpus= boot parameter. Instead, I use cgroup v2
(cpuset.cpus.partition=isolated) to dynamically isolate CPUs after the
system has booted.
Commit 041ee6f3727a changed kthreads to rely on HK_TYPE_DOMAIN.
However, since isolcpus= is not used, HK_TYPE_DOMAIN defaults to all
CPUs at boot time. Even after I later configure cgroups to isolate
CPUs 1-7, unbound kthreads (including kthreadd) remain on those
nohz_full CPUs.
Frederic's patch series is supposed to make the HK_TYPE_DOMAIN cpumask
dynamically exclude cpuset-isolated CPUs. The unbound kthreads are then
supposed to have those CPUs removed from their cpumasks. If that is not
happening, it is a problem we need to look at.
Cheers,
Longman
It seems the assumption that "nohz_full implies domain isolation" only
holds true if isolation is statically defined at boot via isolcpus=.
For dynamic isolation via cgroups, HK_TYPE_KTHREAD and HK_TYPE_DOMAIN
no longer cover the same set of CPUs.
System Log:
Here is the state of my system after setting up the cgroup isolation:
$ uname -r
7.0.0-rc4-1-rt
# 1. Boot parameters (No isolcpus)
$ grep -oe "nohz_full=[^ ]*" -e "rcu_nocbs=[^ ]*" /proc/cmdline
nohz_full=1,2,3,4,5,6,7
rcu_nocbs=1,2,3,4,5,6,7
# 2. Cgroup v2 Isolation is active
$ cat /sys/fs/cgroup/isolated1.slice/cpuset.cpus.exclusive
1-7
$ cat /sys/fs/cgroup/isolated1.slice/cpuset.cpus.partition
isolated
$ cat /sys/fs/cgroup/cpuset.cpus.effective
0
$ cat /sys/fs/cgroup/cpuset.cpus.isolated
1-7
# 3. Unbound kthreads are still "trapped" on isolated/nohz_full CPUs
$ ps -eLo cpuid,comm | grep -e COMM -e "^ *[1-7] " |
    grep -ve "/[1-7]$" -e "kworker/[1-7]:" | head
CPUID COMMAND
4 pool_workqueue_release
1 pr/legacy
4 rcu_exp_gp_kthread_worker
1 kdevtmpfs
5 oom_reaper
1 ksmd
7 watchdogd
7 kswapd0
6 scsi_eh_0
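The cpuid column above shows where each thread last ran, not which CPUs
it is allowed to run on. To tell whether the affinity masks themselves
were updated, the allowed CPU list of each kernel thread can be compared
with the isolated set. The following is only a minimal sketch: the helper
names (expand_cpulist, cpulist_intersect, report_kthreads) are
hypothetical, and it assumes bash, procps ps, and the Cpus_allowed_list
field in /proc/<pid>/status. Per-CPU kthreads pinned to an isolated CPU
will also be listed and need to be filtered out by name.

```shell
#!/bin/bash
# Expand a kernel CPU list such as "0,2-4" into "0 2 3 4".
expand_cpulist() {
    local out=() part lo hi i
    local IFS=','
    for part in $1; do
        lo=${part%-*}   # "2-4" -> 2, "0" -> 0
        hi=${part#*-}   # "2-4" -> 4, "0" -> 0
        for ((i = lo; i <= hi; i++)); do
            out+=("$i")
        done
    done
    IFS=' '
    echo "${out[*]}"
}

# Print the CPUs present in both lists (space-separated), empty if none.
cpulist_intersect() {
    local a b common=()
    for a in $(expand_cpulist "$1"); do
        for b in $(expand_cpulist "$2"); do
            if [ "$a" -eq "$b" ]; then
                common+=("$a")
            fi
        done
    done
    echo "${common[*]}"
}

# List kthreadd (PID 2) and its children whose allowed CPUs overlap the
# isolated list passed as $1, e.g. "1-7".
report_kthreads() {
    local isolated=$1 pid ppid comm allowed
    ps -e -o pid=,ppid=,comm= | while read -r pid ppid comm; do
        if [ "$ppid" -ne 2 ] && [ "$pid" -ne 2 ]; then
            continue
        fi
        allowed=$(awk '/^Cpus_allowed_list:/ {print $2}' \
            "/proc/$pid/status" 2>/dev/null)
        if [ -n "$allowed" ] && \
           [ -n "$(cpulist_intersect "$allowed" "$isolated")" ]; then
            printf '%s\t%s\tallowed=%s\n' "$pid" "$comm" "$allowed"
        fi
    done
}
```

After sourcing the functions, running
report_kthreads "$(cat /sys/fs/cgroup/cpuset.cpus.isolated)"
would print every kernel thread that is still allowed on an isolated CPU.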
Questions:
1. Is this an intended change that mandates the use of isolcpus= for
kthread exclusion?
2. If we prefer dynamic isolation via cgroup v2, is there a
recommended way to "refresh" or move these unbound kthreads once the
housekeeping mask changes at runtime?
3. Or should HK_TYPE_KTHREAD still be considered separately from
HK_TYPE_DOMAIN to account for nohz_full users without isolcpus=?
I would appreciate any insights or suggestions you might have.
Best regards,
Sheviks