Re: [Question] Sched: Severe scheduling latency (>10s) observed on kernel 6.12 with specific workload
From: John Stultz
Date: Thu Apr 09 2026 - 17:40:02 EST
On Tue, Mar 31, 2026 at 7:32 PM Xuewen Yan <xuewen.yan94@xxxxxxxxx> wrote:
>
> I am writing to report a severe scheduling latency issue we recently
> discovered on Linux Kernel 6.12.
>
> Issue Description
>
> We observed that when running a specific background workload pattern,
> certain tasks experience excessive scheduling latency. The delay from
> the runnable state to running on the CPU exceeds 10 seconds, and in
> extreme cases, it reaches up to 100 seconds.
>
> Environment Details
>
> Kernel Version: 6.12.58-android16-6-g3835fd28159d-ab000018-4k
> Architecture: [ ARM64]
> Hardware: T7300
> Config: gki_defconfig
>
> RT-app's workload pattern:
>
> {
> "tasks" : {
> "t0" : {
> "instance" : 40,
> "priority" : 0,
> "cpus" : [ 0, 1, 2, 3 ],
> "taskgroup" : "/background",
> "loop" : -1,
> "run" : 200,
> "sleep" : 50
> }
> }
> }
>
So, with this config I think I may have reproduced it on a device
(using android16-6.12). I've not quite seen 10+ seconds, but I have
seen >2 second delays for kworker threads (though usually the max
seems to be around 600ms).

Unfortunately, trying to reproduce it with qemu using the same
(android16-6.12) kernel branch hasn't been successful so far (and has
been a bit of a yak shaving adventure: rt-app needs cgroupv1, which
newer debian/systemd doesn't support any longer, so I installed a
debian11 image and had to build rt-app and its dependencies from
source, then found the perfetto binaries require a newer glibc, so I
had to fetch and build perfetto from scratch as well). I can't see any
similarly sized delays there.

Out of curiosity, what are you using to detect the problem when you
have rt-app running in the background? I've been tinkering with
cyclictest (-m -t -a --policy=SCHED_OTHER -b 1000000) to try to catch
latencies over 1 second, but I'm curious if you have something better.
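
For anyone without cyclictest handy, the basic idea (sleep for a fixed
interval, then check how late the wakeup was) can be approximated in a
few lines. This is only a rough sketch, not how cyclictest itself is
implemented: the function name find_late_wakeups and its parameters are
made up for illustration, and a userspace Python loop adds jitter of
its own, so it is only useful for spotting very large delays.

```python
import time


def find_late_wakeups(iterations, interval_s, threshold_s,
                      clock=time.monotonic, sleep=time.sleep):
    """Sleep repeatedly and report wakeups that arrive too late.

    Hypothetical helper for illustration only. Returns a list of
    (iteration, overshoot_seconds) tuples for every wakeup whose delay
    beyond the requested interval exceeds threshold_s.
    """
    late = []
    for i in range(iterations):
        start = clock()
        sleep(interval_s)
        # Overshoot is the time spent beyond the requested sleep, which
        # includes timer slack plus the runnable-to-running delay.
        overshoot = clock() - start - interval_s
        if overshoot > threshold_s:
            late.append((i, overshoot))
    return late
```

Calling find_late_wakeups(60000, 0.001, 1.0) from a normal (SCHED_OTHER)
process while rt-app runs in the background would list any iteration
that woke more than one second late.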
thanks
-john