
I am investigating how to keep my Linux desktop experience smooth and interactive while I run CPU-intensive tasks in the background. Here is the sample program (written in Java) that I am using to simulate CPU load:

public class Spinner {
    public static void main(String[] args) {
        // Start 100 threads that each spin forever, saturating every core.
        for (int i = 0; i < 100; i++) {
            (new Thread(new Runnable() {
                public void run() {
                    while (true);
                }
            })).start();
        }
    }
}

When I run this on the command line, I notice that the interactivity of my desktop applications (e.g. text editor) drops significantly. I have a dual-core machine, so I am not surprised by this.

To combat this, my first thought was to nice the process with renice 20 -p <pid>. I found however that this doesn't have much effect. Instead I have to renice all of the child tasks with something like ls /proc/<pid>/task | xargs renice 20 -p -- which has a much greater effect.

I am very confused by this, as I would not expect threads to have their own process IDs. Even if they did I would expect renice to act on the entire process, not just the main thread of the process.

Does anyone have a clear understanding of what is happening here? It appears that each thread is actually a separate process (at least each has a valid PID). I knew that historically Linux worked like this, but I believed NPTL fixed that years ago.

I am testing on RHEL 5.4 (Linux kernel 2.6.18).

(As an aside, I notice the same effect if I try to use sched_setscheduler(<pid>, SCHED_BATCH, ...) to solve this interactivity problem. I.e., I need to make this call for all the "child" processes I see in /proc/<pid>/task; it is not enough to perform it once on the main program PID.)
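
(For reference, here is a rough C++ sketch of what that per-task loop amounts to programmatically: walk /proc/<pid>/task and renice each TID individually. The same loop works for a per-TID sched_setscheduler() call. This is only a sketch with minimal error handling, not tested on RHEL 5.4.)

#include <dirent.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <cstdio>
#include <cstdlib>
#include <string>

int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "usage: %s <pid> <nice>\n", argv[0]);
        return 1;
    }
    const std::string task_dir = std::string("/proc/") + argv[1] + "/task";
    const int nice_value = atoi(argv[2]);

    DIR *dir = opendir(task_dir.c_str());
    if (!dir) { perror("opendir"); return 1; }

    while (dirent *entry = readdir(dir)) {
        const int tid = atoi(entry->d_name);  // "." and ".." parse to 0
        if (tid <= 0) continue;
        // PRIO_PROCESS with a TID renices just that one kernel task;
        // sched_setscheduler(tid, SCHED_BATCH, &param) slots in here too.
        if (setpriority(PRIO_PROCESS, tid, nice_value) != 0)
            perror("setpriority");
    }
    closedir(dir);
    return 0;
}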

pauldoo

2 Answers


Thread IDs come from the same namespace as PIDs. This means that each thread is individually addressable by its TID - some system calls do apply to the entire process (for example, kill) but others apply only to a single thread.

The scheduler system calls are generally in the latter class, because this allows you to give different threads within a process different scheduler attributes, which is often useful.
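
Here is a quick sketch of both points, assuming Linux/NPTL (compile with -pthread; older glibc has no gettid() wrapper, so syscall(SYS_gettid) is used): each thread reports its own TID, and setpriority() with an id of 0 renices only the calling thread.

#include <pthread.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstdio>

static void *worker(void *arg) {
    const int nice_value = *(int *)arg;
    const pid_t tid = (pid_t)syscall(SYS_gettid);

    // Under NPTL this renices only the calling thread, not the process.
    setpriority(PRIO_PROCESS, 0, nice_value);

    printf("pid=%d tid=%d nice=%d\n",
           (int)getpid(), (int)tid, getpriority(PRIO_PROCESS, 0));
    while (true);  // spin so the thread is visible in top (press 'H')
    return NULL;
}

int main() {
    static int nice_a = 0, nice_b = 10;
    pthread_t a, b;
    pthread_create(&a, NULL, worker, &nice_a);
    pthread_create(&b, NULL, worker, &nice_b);
    pthread_join(a, NULL);  // never returns; both threads spin forever
    return 0;
}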

caf
  • Your explanation is consistent with what I observe. It's a shame that the man pages for nice and sched_setparam() refer to PID and not TID. It's hella confusing. – pauldoo Jul 28 '11 at 09:01
  • @pauldoo: Indeed. You can [report bugs in the Linux man pages](http://www.kernel.org/doc/man-pages/reporting_bugs.html). – caf Jul 28 '11 at 09:15
  • As a final comment, I recently discovered [`gettid`](http://www.kernel.org/doc/man-pages/online/pages/man2/gettid.2.html), which obtains the TID of the current thread. On the main thread of the application this should return the same value as the familiar `getpid`. – pauldoo Aug 24 '11 at 08:40

As I understand it, on Linux threads and processes are pretty much the same thing; threads just happen to be processes which share the same memory rather than doing fork's copy-on-write thing, and fork(2) and pthread_create(3) are presumably both just layered onto a call to clone(2) with different arguments.
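
Here's a minimal sketch of that layering (no error handling, and the flag set is only roughly what NPTL uses, per the clone(2) man page; g++ defines _GNU_SOURCE, which makes clone() visible):

#include <sched.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>

static int thread_fn(void *) {
    // Same PID as the parent: from the outside this is "just a thread".
    printf("child: pid=%d\n", (int)getpid());
    return 0;
}

int main() {
    const size_t stack_size = 1024 * 1024;
    char *stack = (char *)malloc(stack_size);  // stack grows down on x86

    // Roughly the flag set NPTL passes: share memory, filesystem info,
    // file descriptors, signal handlers, and join the caller's thread group.
    const int flags = CLONE_VM | CLONE_FS | CLONE_FILES |
                      CLONE_SIGHAND | CLONE_THREAD;
    clone(thread_fn, stack + stack_size, flags, NULL);

    printf("parent: pid=%d\n", (int)getpid());
    sleep(1);  // crude join: CLONE_THREAD children can't be waitpid()ed
    return 0;
}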

The scheduling stuff is very confusing because e.g. the pthreads(7) man page starts off by telling you POSIX threads share a common nice value, but then you have to get down to

NPTL still has a few non-conformances with POSIX.1: Threads do not share a common nice value

to see the whole picture (and I'm sure there are plenty of even less helpful man pages).

I've written GUI apps which spawn multiple compute threads from a main UI thread, and have always found the key to getting the app to remain very responsive is to invoke nice(2) in the compute threads (only); increasing it by 4 or so seems to work well.

Or at least that's what I remembered doing. I just looked at the code for the first time in years and see what I actually did was this:

// Note that this code relies on Linux NPTL's non-POSIX-compliant
// thread-specific nice value (although without a suitable replacement
// per-thread priority mechanism it's just as well it's that way).
// TODO: Should check some error codes,
// but it's probably pretty harmless if it fails.

  const int current_priority = getpriority(PRIO_PROCESS, 0);
  setpriority(PRIO_PROCESS, 0, std::min(19, current_priority + n));

Which is interesting. I probably tried nice(2) and found it did actually apply to the whole process (all threads), which wasn't what I wanted (but maybe you do). But this is going back years now; behaviour might have changed since.

One essential tool when you're playing with this sort of stuff: if you hit 'H' (NB not 'h') in top(1), it changes from process view to showing all the threads and the individual thread nice values. E.g. if I run `evolvotron -t 4 -n 5` (4 compute threads at nice 5) I see (I'm just on an old single-core non-HT machine, so there's not actually much point in multiple threads here):

Tasks: 249 total,   5 running, 244 sleeping,   0 stopped,   0 zombie
Cpu(s): 17.5%us,  6.3%sy, 76.2%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1025264k total,   984316k used,    40948k free,    96136k buffers
Swap:  1646620k total,        0k used,  1646620k free,   388596k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
 4911 local     25   5 81096  23m  15m R 19.7  2.4   0:04.03 evolvotron         
 4912 local     25   5 81096  23m  15m R 19.7  2.4   0:04.20 evolvotron         
 4913 local     25   5 81096  23m  15m R 19.7  2.4   0:04.08 evolvotron         
 4914 local     25   5 81096  23m  15m R 19.7  2.4   0:04.19 evolvotron         
 4910 local     20   0 81096  23m  15m S  9.8  2.4   0:05.83 evolvotron         
 ...
timday
  • Threads are more than just processes that share the same `mm` and file table. Internally they are members of the same task group, which has other effects - for example, process-directed signals can be handled by any thread within the task group; if any thread calls the `exit_group()` syscall (mapped to the `exit()` wrapper in glibc) the entire task group exits; a process-terminating signal like `SIGSEGV` will terminate the entire task group; only the thread group leader gets a top-level directory in `/proc`. – caf Jul 29 '11 at 04:33
  • Cool, thanks for the 'top' trick. This seems to confirm the behaviour I see. Shame I don't see a way for top to show the scheduler policy for each thread. – pauldoo Jul 29 '11 at 08:40
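
(On that last point: top won't show it, but sched_getscheduler(2) accepts a TID on Linux, so a short sketch along these lines, walking /proc/<pid>/task as before, can list each thread's policy. Untested, with minimal error handling; SCHED_BATCH needs _GNU_SOURCE, which g++ defines by default.)

#include <dirent.h>
#include <sched.h>
#include <cstdio>
#include <cstdlib>
#include <string>

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s <pid>\n", argv[0]); return 1; }
    const std::string task_dir = std::string("/proc/") + argv[1] + "/task";

    DIR *dir = opendir(task_dir.c_str());
    if (!dir) { perror("opendir"); return 1; }

    while (dirent *entry = readdir(dir)) {
        const int tid = atoi(entry->d_name);
        if (tid <= 0) continue;  // skips "." and ".."
        const int policy = sched_getscheduler(tid);
        printf("tid %d: %s\n", tid,
               policy == SCHED_OTHER ? "SCHED_OTHER" :
               policy == SCHED_FIFO  ? "SCHED_FIFO"  :
               policy == SCHED_RR    ? "SCHED_RR"    :
               policy == SCHED_BATCH ? "SCHED_BATCH" : "unknown");
    }
    closedir(dir);
    return 0;
}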