
Would a process with more threads on Linux get more CPU time than a process with one thread?

In Linux, both processes and threads are described by a task struct, and scheduling is based on tasks. I also found this:

When a new process is created, do_fork() sets the counter field of both current (the parent) and p (the child) processes in the following way:

current->counter >>= 1;
p->counter = current->counter;

In other words, the number of ticks left to the parent is split in two halves, one for the parent and one for the child. This is done to prevent users from getting an unlimited amount of CPU time by using the following method: the parent process creates a child process that runs the same code and then kills itself; by properly adjusting the creation rate, the child process would always get a fresh quantum before the quantum of its parent expires. This programming trick does not work since the kernel does not reward forks. Similarly, a user cannot hog an unfair share of the processor by starting lots of background processes in a shell or by opening a lot of windows on a graphical desktop. More generally speaking, a process cannot hog resources (unless it has privileges to give itself a real-time policy) by forking multiple descendants.
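To make the quoted rule concrete, here is a minimal sketch that models only the two quoted lines, not the real scheduler. The starting quantum of 20 ticks and the `fork` helper are assumptions made up for illustration.

```python
def fork(counters, parent):
    """Model of the quoted do_fork() rule: halve the parent's remaining
    ticks (integer shift) and give the child the same amount."""
    counters[parent] >>= 1
    counters.append(counters[parent])
    return len(counters) - 1  # index of the new child


# Hypothetical starting quantum of 20 ticks for a single process.
counters = [20]
child = fork(counters, 0)            # parent keeps 10, child gets 10
grandchild = fork(counters, child)   # child keeps 5, grandchild gets 5

print(counters)       # [10, 5, 5]
print(sum(counters))  # 20
```

In this model the total number of remaining ticks never grows, no matter how many times a task forks, which is the point the quoted passage makes.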

Actually, I couldn't find that code in the kernel sources, but maybe that's my fault; I may have been looking at the wrong kernel version.

But what happens later? Does every thread take part in scheduling like a separate process? Would a process with ten threads get ten times more ticks than a process with one thread? And how does I/O fit into this?

van

1 Answer


Yes, a process with more threads would get more CPU time than its competitors. A well-known case is a maven compile: maven uses lots of CPU-intensive threads, hogging the system.

However, the current Linux scheduler doesn't take only tasks into account; it also takes control groups in the cpu cgroup hierarchy into account. CPU time is first divided between control groups, and then, within each control group, between its tasks.
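As a rough illustration of that two-level split, here is an idealized model, not what the kernel actually computes. It assumes a single CPU, tasks that are always runnable, and equal weights for every task and every group; the group names and task counts are made up.

```python
from fractions import Fraction


def flat_shares(groups):
    """No grouping: every task competes directly, so a group's total
    share is proportional to how many tasks it runs."""
    total = sum(groups.values())
    return {name: Fraction(n, total) for name, n in groups.items()}


def grouped_shares(groups):
    """Grouping: the CPU is split equally between groups first; the
    tasks inside a group then divide that group's share."""
    return {name: Fraction(1, len(groups)) for name in groups}


# Hypothetical workload: number of CPU-bound tasks in each group.
groups = {"maven": 10, "shell": 1}

flat = flat_shares(groups)
grouped = grouped_shares(groups)

print(flat["maven"], flat["shell"])        # 10/11 1/11
print(grouped["maven"], grouped["shell"])  # 1/2 1/2

# Share of a single maven thread once groups are taken into account.
per_thread = grouped["maven"] / groups["maven"]
print(per_thread)                          # 1/20
```

Under these assumptions the ten-thread process drops from 10/11 of the CPU to 1/2, while the single-task group rises from 1/11 to 1/2.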

Since 2.6.38, Linux automatically puts tasks into different cpu cgroups based on their session IDs. This means that, e.g., separate tabs in konsole/gnome-terminal get their own control group. So now your maven compilation is nicely isolated, and no longer hogs the system. See the descriptions at kernelnewbies and lwn.net.

Before 2.6.38 hit most systems, Lennart Poettering showed how to do it manually with a shell script in this LKML message.

I actually have a system where I run Eclipse and maven compiles, and the change from pre-2.6.38 to pre-2.6.38 + Lennart's cgroup binding (which I put in /etc/bashrc and in my Eclipse launcher script) was just perfect. Maven no longer hogs the system (you wouldn't know there was a maven compile going on if it weren't for the CPU load monitor), and Eclipse now just hogs itself, not the rest of the system (I'll settle for that with Eclipse). Now I just need to update the kernel to one with better dirty-page writeback and that system will be a breeze to work on.

ninjalj
  • Thank you for such a detailed answer! Is it correct that cgroups also limit the IO of groups in the same way? – van Oct 04 '12 at 07:05
  • Yes, though I'm not sure of the current state of I/O scheduler support for complex hierarchies; see: http://lwn.net/Articles/427961/ – ninjalj Oct 04 '12 at 20:00
  • I tried to verify your statement about per-session cpu cgroups, and as far as I can see it does not hold on my Ubuntu 12.04 laptop: http://pastebin.com/0Fjp2BQy. The situation on RedHat 6.3 is even worse: all processes are in one cgroup, "/". Is it distro-specific, or what am I doing wrong? – van Oct 21 '12 at 17:35
  • `cat /proc/sys/kernel/sched_autogroup_enabled` prints 1 – van Oct 21 '12 at 17:37
  • Not visible in the normal cgroup hierarchy, visible in: `cat /proc/<pid>/autogroup`. See http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=5091faa449ee0b7d73bc296a93bca9540fc51d0a – ninjalj Oct 21 '12 at 22:06
  • Notice in particular: _At runqueue selection time, IFF a task has no cgroup assignment, its current autogroup is used._ – ninjalj Oct 21 '12 at 22:12
  • Thank you very much! One more clarification. I looked at the cgroups documentation at http://www.kernel.org/doc/Documentation/cgroups/cgroups.txt and can see the subsystem API and the hooks on fork, exit and others, but I don't understand how the CPU and IO schedulers interact with cgroups. As I understand it, they should have some callbacks to ask cgroups which process should be scheduled to run at the moment, or am I wrong somewhere again? – van Oct 22 '12 at 19:16
  • CPU and IO schedulers manage their own cgroup subsystems, e.g: `"cpu"` and `"cpuacct"` for the CPU scheduler in `kernel/sched/`. The cgroup system "calls back" to "methods" in the cgroup subsystems when there is work to do with cgroups, e.g: when a sysadmin moves a task into a cgroup. Of course the schedulers have direct access to their relevant cgroup subsystems, e.g: the CFS scheduler walks the cgroup hierarchy when distributing runtime to tasks. – ninjalj Oct 22 '12 at 22:21
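For anyone who wants to check the autogroup assignment discussed in the comments from a script, here is a small sketch. It assumes the per-process autogroup file contains a single line of the form `/autogroup-<id> nice <n>`; the helper names `parse_autogroup` and `read_autogroup` are made up for this example.

```python
import re
from pathlib import Path


def parse_autogroup(text):
    """Parse a line such as '/autogroup-123 nice 0' into (group_id, nice).
    The line format is an assumption; returns None if it does not match."""
    match = re.fullmatch(r"/autogroup-(\d+) nice (-?\d+)", text.strip())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def read_autogroup(pid="self"):
    """Return (group_id, nice) for a PID, or None if the file is missing
    (for example on a kernel built without autogroup support)."""
    try:
        text = Path(f"/proc/{pid}/autogroup").read_text()
    except OSError:
        return None
    return parse_autogroup(text)


print(parse_autogroup("/autogroup-123 nice 0"))  # (123, 0)
print(read_autogroup())  # result depends on the running kernel
```

Two shells opened in separate terminal tabs would be expected to report different group IDs when autogrouping is enabled, while threads of one process share a single ID.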