how does ps work?
A good way to learn how standard utilities work is to read their source code. There are several implementations of ps: procps and busybox. The busybox one is smaller, so it is easier to start with. Here are the sources for ps from busybox: https://git.busybox.net/busybox/tree/procps. The main loop from ps.c:
p = NULL;
while ((p = procps_scan(p, need_flags)) != NULL) {
    format_process(p);
}
The implementation of procps_scan is in procps.c (ignore the code inside the ENABLE_FEATURE_SHOW_THREADS ifdefs on a first read). The first call to it opens the /proc directory using alloc_procps_scan():
sp = alloc_procps_scan();        /* in procps_scan() */
sp->dir = xopendir("/proc");     /* in alloc_procps_scan() */
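The "allocate and open on the first call" pattern can be sketched with plain libc calls. This is only an illustration, not busybox's code: struct scan and scan_next() are hypothetical names, and plain opendir() stands in for busybox's xopendir() wrapper.

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scan state for this sketch; busybox keeps much more
 * per-process data in its own structure. */
struct scan {
    DIR *dir;
    char name[256];   /* name of the current directory entry */
};

/* Pass NULL on the first call. Returns the scan state positioned on the
 * next entry, or NULL (after cleaning up) when the directory is exhausted
 * or cannot be opened. */
struct scan *scan_next(struct scan *sp, const char *root)
{
    struct dirent *entry;

    if (sp == NULL) {                     /* first call: allocate and open */
        sp = calloc(1, sizeof(*sp));
        if (sp == NULL)
            return NULL;
        sp->dir = opendir(root);
        if (sp->dir == NULL) {
            free(sp);
            return NULL;
        }
    }

    entry = readdir(sp->dir);
    if (entry == NULL) {                  /* end of directory: clean up */
        closedir(sp->dir);
        free(sp);
        return NULL;
    }
    strncpy(sp->name, entry->d_name, sizeof(sp->name) - 1);
    sp->name[sizeof(sp->name) - 1] = '\0';
    return sp;
}
```

A caller can then use the same loop shape as ps.c: start with sp = NULL and repeat sp = scan_next(sp, "/proc") until it returns NULL, reading sp->name on each iteration.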
Then procps_scan will read the next entry from the /proc directory:
for (;;) {
    /* ... */
    entry = readdir(sp->dir);
parse the PID from the subdirectory name:
pid = bb_strtou(entry->d_name, NULL, 10);
and read /proc/pid/stat:
/* These are all retrieved from proc/NN/stat in one go: */
/* see proc(5) for some details on this */
strcpy(filename_tail, "stat");
n = read_to_buf(filename, buf);
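The steps above (take a directory entry, parse its name as a PID, read and split the stat file) can be sketched with standard libc calls instead of busybox's helpers. The names struct stat_info, parse_pid(), parse_stat() and read_stat() are hypothetical, and only the first few fields of stat are extracted; see proc(5) for the full field list.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result struct for this sketch. */
struct stat_info {
    long pid;
    char comm[64];
    char state;
    int  ppid;
};

/* Returns the PID encoded in a /proc entry name, or -1 if the name is
 * not a plain decimal number (for example "self" or "."). */
long parse_pid(const char *name)
{
    char *end;
    unsigned long v;

    if (name[0] < '0' || name[0] > '9')
        return -1;
    errno = 0;
    v = strtoul(name, &end, 10);
    if (errno != 0 || *end != '\0' || v > INT_MAX)
        return -1;                       /* overflow or trailing junk */
    return (long)v;
}

/* Parses "pid (comm) state ppid ..." from one stat line.
 * Returns 0 on success, -1 if the line does not have that shape. */
int parse_stat(const char *buf, struct stat_info *out)
{
    const char *lparen = strchr(buf, '(');
    const char *rparen = strrchr(buf, ')');  /* comm may itself contain ')' */
    size_t len;

    if (lparen == NULL || rparen == NULL || rparen < lparen)
        return -1;
    if (sscanf(buf, "%ld", &out->pid) != 1)
        return -1;

    len = (size_t)(rparen - lparen) - 1;
    if (len >= sizeof(out->comm))
        len = sizeof(out->comm) - 1;
    memcpy(out->comm, lparen + 1, len);
    out->comm[len] = '\0';

    if (sscanf(rparen + 1, " %c %d", &out->state, &out->ppid) != 2)
        return -1;
    return 0;
}

/* Reads and parses <root>/<name>/stat; root is normally "/proc". */
int read_stat(const char *root, const char *name, struct stat_info *out)
{
    char path[512], buf[1024];
    FILE *f;
    size_t n;

    if (parse_pid(name) < 0)
        return -1;
    snprintf(path, sizeof(path), "%s/%s/stat", root, name);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;                       /* the process may have exited */
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return parse_stat(buf, out);
}
```

The command name is located by searching for the last ')' rather than splitting on spaces, because a process name may contain spaces or parentheses.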
The actual printing, which is unconditional, happens in format_process in ps.c.
So busybox's simple ps reads the data for all processes and prints all of them (or all processes and all threads if the -T option is given).
how can I select only those processes which are active when psmod is called?
What does "active" mean? If you want to find all processes that exist, do a readdir of /proc. If you want only the non-sleeping ones, read all of /proc, check the state of every process, and print only the non-sleeping ones. The /proc filesystem is virtual, so this is rather fast.
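A minimal sketch of that "read everything, print only the non-sleeping ones" approach is below. It assumes that "active" means state R (running or runnable) in /proc/PID/stat; that definition and the names stat_state(), is_active() and print_active() are illustrative choices, not part of any existing ps.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Returns the state letter from one /proc/PID/stat line, or 0 on error.
 * The state is the first field after the last ')' (see proc(5)). */
char stat_state(const char *line)
{
    const char *rparen = strrchr(line, ')');
    char state;

    if (rparen == NULL || sscanf(rparen + 1, " %c", &state) != 1)
        return 0;
    return state;
}

/* Assumption for this sketch: only state R counts as "active". */
int is_active(char state)
{
    return state == 'R';
}

/* Scans proc_root (normally "/proc") and prints "PID STATE" for every
 * active process. Returns the number printed, or -1 if the directory
 * cannot be opened. */
int print_active(const char *proc_root, FILE *out)
{
    DIR *dir = opendir(proc_root);
    struct dirent *entry;
    int shown = 0;

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        char path[512], line[512];
        FILE *f;

        if (!isdigit((unsigned char)entry->d_name[0]))
            continue;                    /* not a PID directory */
        snprintf(path, sizeof(path), "%s/%s/stat", proc_root, entry->d_name);
        f = fopen(path, "r");
        if (f == NULL)
            continue;                    /* process exited in the meantime */
        if (fgets(line, sizeof(line), f) != NULL) {
            char state = stat_state(line);
            if (is_active(state)) {
                fprintf(out, "%s %c\n", entry->d_name, state);
                shown++;
            }
        }
        fclose(f);
    }
    closedir(dir);
    return shown;
}
```

Calling print_active("/proc", stdout) on Linux lists whatever is runnable at the moment of the scan. The result is only a snapshot: a process can change state or exit between the readdir and the read of its stat file, which is why a failed fopen is skipped rather than treated as an error.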
PS: For example, the normal ps program prints only the processes from the current terminal, usually two:
$ ps
PID TTY TIME CMD
7925 pts/13 00:00:00 bash
7940 pts/13 00:00:00 ps
But if we trace it with strace -ttt -o ps.log ps, we can see that ps still reads every process directory, specifically the stat and status files. The time needed for this (the -ttt option of strace gives us a timestamp for every syscall) is XX.719011 to XX.870349, or about 150 ms under strace (which slows down every syscall). In real life it takes only about 20 ms according to time ps (I have 250 processes in total):
$ time ps
PID TTY TIME CMD
7925 pts/13 00:00:00 bash
7971 pts/13 00:00:00 ps
real 0m0.021s
user 0m0.006s
sys 0m0.014s