I want to use ftrace to trace a specific process's wakeup latency. But ftrace's wakeup tracer only records the maximum latency seen system-wide, and set_ftrace_pid does not help here.
Does anybody know how to do that?
Thank you very much.
You can use the tool I wrote to trace a specific process's wakeup latency:
https://gitee.com/openeuler-competition/summer2021-42
This tool supports analyzing the overall scheduling latency of the system through ftrace raw data.
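If you only need the raw data for a single process, you can also collect it directly from ftrace by filtering the scheduler tracepoints on the target PID. set_ftrace_pid does not help because it only affects the function tracers, but the per-event filter files do. A minimal sketch, assuming the target PID is 1234 (a placeholder) and tracefs is mounted at /sys/kernel/debug/tracing:
cd /sys/kernel/debug/tracing              # or /sys/kernel/tracing on newer kernels
echo 'pid == 1234' > events/sched/sched_wakeup/filter
echo 'prev_pid == 1234 || next_pid == 1234' > events/sched/sched_switch/filter
echo 1 > events/sched/sched_wakeup/enable
echo 1 > events/sched/sched_switch/enable
echo 1 > tracing_on
cat trace_pipe > /tmp/sched_raw.txt       # collect raw events, stop with Ctrl-C
echo 0 > tracing_on                       # clean up afterwards
echo 0 > events/sched/sched_wakeup/enable
echo 0 > events/sched/sched_switch/enable
echo 0 > events/sched/sched_wakeup/filter
echo 0 > events/sched/sched_switch/filter
The per-wakeup latency is then the delta between each sched_wakeup timestamp for the PID and the following sched_switch event in which that PID is switched in.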
Adding to @Qeole's suggestion, you can also use the perf sched utility to obtain a much more detailed trace of a process's wakeup latency. While eBPF tools like runqlat give you a higher-level overview, perf sched captures all scheduler events and thereby lets you observe and inspect a process's wakeup latencies in much more detail. Note that running perf sched against a long-running, computationally intensive process comes with its own overhead.
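As an aside, if that higher-level overview is enough for your case, the bcc version of runqlat can itself be restricted to a single PID; a hedged example (the exact binary name varies by distribution, e.g. runqlat-bpfcc on Debian/Ubuntu, and 1234 is a placeholder PID):
sudo runqlat -p 1234 10 1    # run-queue (wakeup-to-run) latency histogram for PID 1234, one 10-second interval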
You first need to run perf sched record. From the man page:
'perf sched record <command>' records the scheduling events of an arbitrary workload.
For example, say you want to trace the wakeup latencies of the command ls:
sudo perf sched record ls
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.453 MB perf.data (562 samples) ]
You will see that a perf.data file is generated in the directory where the command was run. This file contains the raw scheduler events, and the commands below help make sense of them.
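If the process you are interested in is already running rather than being launched under perf, a common pattern (sketched here with an arbitrary 10-second window) is to record scheduler events for a fixed duration and then filter by PID afterwards, as shown further below:
sudo perf sched record -- sleep 10    # record scheduler events for 10 seconds, then analyse perf.data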
Run perf sched latency to obtain per-task latency summaries, including the number of context switches per task and the average and maximum delay:
sudo perf sched latency
-----------------------------------------------------------------------------------------------------------------
Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | Maximum delay at |
-----------------------------------------------------------------------------------------------------------------
migration/4:35 | 0.000 ms | 1 | avg: 0.003 ms | max: 0.003 ms | max at: 231259.727951 s
kworker/u16:0-p:6962 | 0.103 ms | 20 | avg: 0.003 ms | max: 0.035 ms | max at: 231259.729314 s
ls:7118 | 1.752 ms | 1 | avg: 0.003 ms | max: 0.003 ms | max at: 231259.727898 s
alsa-sink-Gener:3133 | 0.000 ms | 1 | avg: 0.003 ms | max: 0.003 ms | max at: 231259.729321 s
Timer:5229 | 0.035 ms | 1 | avg: 0.002 ms | max: 0.002 ms | max at: 231259.729625 s
AudioIP~ent RPC:7597 | 0.040 ms | 1 | avg: 0.002 ms | max: 0.002 ms | max at: 231259.729698 s
MediaTimer #1:7075 | 0.025 ms | 1 | avg: 0.002 ms | max: 0.002 ms | max at: 231259.729651 s
gnome-terminal-:4989 | 0.254 ms | 24 | avg: 0.001 ms | max: 0.003 ms | max at: 231259.729358 s
MediaPl~back #3:7098 | 0.034 ms | 1 | avg: 0.001 ms | max: 0.001 ms | max at: 231259.729670 s
kworker/u16:2-p:5987 | 0.144 ms | 32 | avg: 0.001 ms | max: 0.002 ms | max at: 231259.729193 s
perf:7114 | 3.503 ms | 1 | avg: 0.001 ms | max: 0.001 ms | max at: 231259.729656 s
kworker/u16:1-p:7112 | 0.184 ms | 52 | avg: 0.001 ms | max: 0.001 ms | max at: 231259.729201 s
chrome:5713 | 0.067 ms | 1 | avg: 0.000 ms | max: 0.000 ms | max at: 0.000000 s
-----------------------------------------------------------------------------------------------------------------
TOTAL: | 6.141 ms | 137 |
---------------------------------------------------
You can see the ls process, as well as the perf process itself, among all the other processes that were running while the perf sched record command was active.
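If you want a different sort order, or one row per pid instead of per command name, perf sched latency accepts -s/--sort and -p/--pids (see the perf-sched man page); for example:
sudo perf sched latency -s max -p    # sort by maximum delay, report per pid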
Run perf sched timehist to obtain a detailed summary of the individual scheduler events:
sudo perf sched timehist
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
231259.726350 [0005] <idle> 0.000 0.000 0.000
231259.726465 [0005] chrome[5713] 0.000 0.000 0.114
231259.727447 [0005] <idle> 0.114 0.000 0.981
231259.727513 [0005] chrome[5713] 0.981 0.000 0.066
231259.727898 [0004] <idle> 0.000 0.000 0.000
231259.727951 [0004] perf[7118] 0.000 0.002 0.052
231259.727958 [0002] perf[7114] 0.000 0.000 0.000
231259.727960 [0000] <idle> 0.000 0.000 0.000
231259.727964 [0004] migration/4[35] 0.000 0.002 0.013
231259.729193 [0006] <idle> 0.000 0.000 0.000
231259.729201 [0002] <idle> 0.000 0.000 1.242
231259.729201 [0003] <idle> 0.000 0.000 0.000
231259.729216 [0002] kworker/u16:1-p[7112] 0.006 0.001 0.005
231259.729219 [0002] <idle> 0.005 0.000 0.002
231259.729222 [0002] kworker/u16:1-p[7112] 0.002 0.000 0.002
231259.729222 [0006] <idle> 0.001 0.000 0.007
The wait time is the time the task spent off-CPU waiting to be woken up, and the sch delay is the time between the task being woken up (placed on the run queue) and it actually getting to run on a CPU, i.e. the wakeup latency you are after.
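If all you want is a single worst-case number, you can reduce the sch delay column of the filtered timehist output with a little shell; a rough sketch that assumes the default column layout shown above, where sch delay is the fifth field:
sudo perf sched timehist -p 7118 | awk '$5 + 0 > max { max = $5 } END { print "worst sch delay:", max, "ms" }'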
You can filter the timehist output by pid; the pid of the ls command was 7118 (you can see this in the perf sched latency output above):
sudo perf sched timehist -p 7118
Samples do not have callchains.
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
231259.727951 [0004] perf[7118] 0.000 0.002 0.052
231259.729657 [0000] ls[7118] 0.009 0.000 1.697
Now, in order to observe the wakeup events for this process, you can add the -w command-line switch to the previous command:
sudo perf sched timehist -p 7118 -w
Samples do not have callchains.
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
231259.727895 [0002] perf[7114] awakened: perf[7118]
231259.727948 [0004] perf[7118] awakened: migration/4[35]
231259.727951 [0004] perf[7118] 0.000 0.002 0.052
231259.729190 [0000] ls[7118] awakened: kworker/u16:2-p[5987]
231259.729199 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729207 [0000] ls[7118] awakened: kworker/u16:2-p[5987]
231259.729209 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729212 [0000] ls[7118] awakened: kworker/u16:2-p[5987]
231259.729218 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729221 [0000] ls[7118] awakened: kworker/u16:2-p[5987]
231259.729223 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729226 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729231 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729233 [0000] ls[7118] awakened: kworker/u16:2-p[5987]
231259.729237 [0000] ls[7118] awakened: kworker/u16:2-p[5987]
231259.729240 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729242 [0000] ls[7118] awakened: kworker/u16:2-p[5987]
-------------------------------------- # some other events here
231259.729548 [0000] ls[7118] awakened: kworker/u16:0-p[6962]
231259.729553 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729555 [0000] ls[7118] awakened: kworker/u16:0-p[6962]
231259.729557 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729562 [0000] ls[7118] awakened: kworker/u16:0-p[6962]
231259.729564 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729655 [0000] ls[7118] awakened: perf[7114]
231259.729657 [0000] ls[7118] 0.009 0.000 1.697
The kworker threads interrupt the initial execution of perf and its child process ls starting at timestamp 231259.729190. You can see that the perf process eventually gets woken up at 231259.729655, after the kernel worker threads have done their work. You can get a more detailed per-CPU visualization of the above timehist output with the following command:
sudo perf sched timehist -p 7118 -wV
Samples do not have callchains.
time cpu 012345678 task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ --------- ------------------------------ --------- --------- ---------
231259.727895 [0002] perf[7114] awakened: perf[7118]
231259.727948 [0004] perf[7118] awakened: migration/4[35]
231259.727951 [0004] s perf[7118] 0.000 0.002 0.052
231259.729190 [0000] ls[7118] awakened: kworker/u16:2-p[5987]
231259.729199 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729207 [0000] ls[7118] awakened: kworker/u16:2-p[5987]
231259.729209 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729212 [0000] ls[7118] awakened: kworker/u16:2-p[5987]
-------------------------------------------------- # some other events here
231259.729562 [0000] ls[7118] awakened: kworker/u16:0-p[6962]
231259.729564 [0000] ls[7118] awakened: kworker/u16:1-p[7112]
231259.729655 [0000] ls[7118] awakened: perf[7114]
231259.729657 [0000] s ls[7118] 0.009 0.000 1.697
The CPU visualization column ("012345678") marks context-switch events with an "s", indicating that first CPU 4 and then CPU 0 context-switched to the ls process.
Note: You can supplement the above information with the output of the remaining perf sched subcommands, namely perf sched script and perf sched map.
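Both read the same perf.data file, for instance:
sudo perf sched script    # dump the recorded events, one line per scheduler event
sudo perf sched map       # textual per-CPU map of which task ran on which CPU over time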