0

I'm trying to implement my own ps command, called psmod. I can use Linux system calls and everything available under the /proc directory.

I discovered that every directory in /proc whose name is a number corresponds to a process in the system. My question is: how can I select only those processes that are active when psmod is called? I know that /proc/<pid>/stat contains a letter representing the current state of the process; however, for every process in /proc this letter is S, that is, sleeping.

I also tried sending signal 0 to every PID, from 0 up to the maximum number of processes (in my case, 32768), but this way it discovers far more processes than the ones present in /proc.

So, my question is: how does ps work? The source is a little too complicated for me, so I would be grateful if someone could explain it.

lhf
Michael
  • "it discovers far more processes" I think you can discover some threads too. They occupy some pids; check the `/proc/<pid>/task` subdirectories to get a list of them. – osgx May 12 '14 at 12:42
  • "*The source is a little too complicated for me ...*" this is a bad prerequisite to implement your own `ps`. So bite and go through `ps`'s code, that's a perfect training!-) – alk May 12 '14 at 13:38
  • "how can I select only those processes which are active"... That's a pretty vague condition. If you could inspect all of `/proc` instantaneously (you can't), on an N-CPU system, at most N processes would show up as "active", one of which would be your program. Since you have to iterate through `/proc`, though, there's all kinds of races in figuring out what is active - by the time you report it, it may no longer be active, and other things will have taken their place... Better to load them all and then sort through what you want. – twalberg May 12 '14 at 14:07

2 Answers

8

how does ps work?

The way to learn how standard utilities work is to read their source code. There are several implementations of ps, including procps and busybox; busybox is smaller, so it is easier to start with. The busybox sources for ps are at https://git.busybox.net/busybox/tree/procps. The main loop from ps.c:

p = NULL;
while ((p = procps_scan(p, need_flags)) != NULL) {
    format_process(p);
}

The implementation of procps_scan is in procps.c (on a first reading, ignore the code inside the ENABLE_FEATURE_SHOW_THREADS ifdefs). The first call to it opens the /proc directory using alloc_procps_scan():

sp = alloc_procps_scan();

sp->dir = xopendir("/proc");

Then procps_scan will read the next entry from the /proc directory:

for (;;) {
    entry = readdir(sp->dir);

parse the pid from the subdirectory name:

    pid = bb_strtou(entry->d_name, NULL, 10);

and read /proc/<pid>/stat:

    /* These are all retrieved from proc/NN/stat in one go: */
        /* see proc(5) for some details on this */
        strcpy(filename_tail, "stat");
        n = read_to_buf(filename, buf);

The actual, unconditional printing is done in format_process in ps.c.

So busybox's simple ps reads the data for all processes and prints all of them (or all processes and all threads if the -T option is given).

how can I select only those processes which are active when psmod is called?

What is "active"? If you want to find all processes that exist, do a readdir of /proc. If you want to find only the non-sleeping ones, read all of /proc, check the state of every process, and print only those that are not sleeping. The /proc filesystem is virtual, so this is rather fast.

PS: for example, the normal ps program prints only the processes of the current terminal, usually two:

$ ps
  PID TTY          TIME CMD
 7925 pts/13   00:00:00 bash
 7940 pts/13   00:00:00 ps

But if we trace it with strace -ttt -o ps.log ps, we can see that ps does read every process directory and its stat and status files. The time needed for this (the -ttt option of strace timestamps every syscall) is XX.719011 - XX.870349, or about 150 ms under strace (which slows down all syscalls). It takes only about 20 ms in real life according to time ps (I have 250 processes in total):

$ time ps
  PID TTY          TIME CMD
 7925 pts/13   00:00:00 bash
 7971 pts/13   00:00:00 ps

real    0m0.021s
user    0m0.006s
sys     0m0.014s
osgx
-3

"My question is: how can I select only those processes which are active when psmod is called?"

I hope this command will help you:

top -b -n 1 | awk "NR > 7" | awk {'print $1,$8,$12'} | grep R

I am on Ubuntu 12.

Max
  • Max, I think Michael needs a solution in plain C, without using top or ps. Also, `top -n` doesn't print all processes; it prints only one screen of them. There can be more active processes than fit on one screen. – osgx May 12 '14 at 12:46