
Using Linux

$ uname -r
4.4.0-1041-aws
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:    16.04
Codename:   xenial

With limits allowing up to 200k processes

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 563048
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 524288
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
$ cat /proc/sys/kernel/pid_max
200000
$ cat /proc/sys/kernel/threads-max
1126097

And enough free memory to give 1MB each to 127k processes

$ free
              total        used        free      shared  buff/cache   available
Mem:      144156492     5382168   130458252      575604     8316072   137302624
Swap:             0           0           0
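
(A quick sanity check of that figure, using shell arithmetic on the "free" column above: 130458252 kB is roughly 127k MB, i.e. about 127k processes at 1 MB each.)

$ echo $(( 130458252 / 1024 ))
127400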

And I have fewer than 1k existing processes/threads.

$ ps -elfT | wc -l
832

But I cannot start 50k processes

$ echo '
seq 50000 | while read _; do
    sleep 20 &
done
' | bash
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
...

Why can't I create 50k processes?

Paul Draper
  • What is the *actual* use case? What is the *real* program you want to run? Perhaps *improve* your question – Basile Starynkevitch Dec 15 '17 at 20:45
  • @BasileStarynkevitch, those questions are generally overly broad, as they involve opinions, big-picture design, situational details, etc., and they are off-topic. This one is extremely clear: what prevents 50k processes in Linux? – Paul Draper Dec 15 '17 at 20:49
  • The `fork` documentation gives several reasons for `EAGAIN`. I don't know which one applies in your case. I recommend asking for sysadmin help (perhaps from Amazon). – Basile Starynkevitch Dec 15 '17 at 20:50

3 Answers


It was caused by the Linux cancer, systemd.

In addition to kernel.pid_max and ulimit, I also needed to change a third limit.

/etc/systemd/logind.conf

[Login]
UserTasksMax=70000

And then restart.
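
For completeness, here is a sketch of all three changes on this Ubuntu 16.04 / systemd setup. The values and the sed one-liner are illustrative; editing logind.conf by hand works just as well, and a full reboot is the safe way to make the logind change take effect.

# kernel-wide PID limit (already raised in the question)
$ sudo sysctl -w kernel.pid_max=200000
# per-user process limit (already "unlimited" in the ulimit output above)
$ ulimit -u unlimited
# systemd's per-session task limit
$ sudo sed -i 's/^#\?UserTasksMax=.*/UserTasksMax=70000/' /etc/systemd/logind.conf
$ sudo systemctl restart systemd-logind   # or reboot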

Paul Draper

Because each process requires some resources: some RAM (including some kernel memory), some CPU, etc.

Each process has its own virtual address space, including its own call stack. Some of that requires physical resources, including several pages of RAM; read more about resident set size (on my desktop, the RSS of a bash process is about 6 MB). So a process is actually quite a heavyweight thing.
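
To see this on your own machine, the resident set size of the current shell can be read directly (a small sketch; the numbers vary per system):

$ grep VmRSS /proc/$$/status    # RSS of the current shell, in kB
$ ps -o rss=,comm= -p $$        # the same figure as reported by ps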

BTW, this is not specific to Linux.

Read more about operating systems, e.g. Operating Systems: Three Easy Pieces.

Try also cat /proc/$$/maps and cat /proc/$$/status, and read more about proc(5). Read about the failure modes of fork(2) and of execve(2). The "Resource temporarily unavailable" message corresponds to EAGAIN (see errno(3)), and several different conditions can make fork fail with EAGAIN. On my system, cat /proc/sys/kernel/pid_max gives 32768 (and reaching that limit makes fork fail with EAGAIN).
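
To see which of those EAGAIN conditions is being hit, one diagnostic sketch is to trace the fork itself (on Linux, bash forks via the clone syscall, so that is the call to watch; expect a lot of output):

$ strace -qf -e trace=clone \
    bash -c 'seq 50000 | while read _; do sleep 20 & done' 2>&1 \
  | grep EAGAIN | head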

BTW, imagine if you could fork ten thousand processes. Then context-switching time would dominate the actual running time.

Your Linux system looks like some AWS instance. Amazon won't let you create that many processes, because their hardware is not sized for that.

(On some costly supercomputer or server with, e.g., a terabyte of RAM and a hundred cores, perhaps you could run 50k processes; I guess that would need a particular kernel, or kernel configuration. I recommend getting help from Amazon support.)

Basile Starynkevitch
  • I don't think the processes are using enough to run out of RAM. As the OP points out, there is enough memory for each process to have over 1 MB of resident memory. As far as I can tell, the sleep processes in the example only use ~700 KB of resident memory each. In Linux, at least, the executable code is shared between processes that use the same binary or .so. – Thayne Dec 15 '17 at 20:27
  • (1) /proc/sys/kernel/pid_max is 200k as I stated (increased with sysctl) (2) "on some costly supercomputer or server with e.g. a terabyte of RAM and a hundred of cores" I have 72 cores and 144GB of RAM. I've tried more RAM but it doesn't seem to make a difference. – Paul Draper Dec 15 '17 at 20:44
  • Then examine all the other possible `fork` failures. Perhaps contact Amazon to get some help. BTW, you should improve your question (to give those details), not comment on my answer – Basile Starynkevitch Dec 15 '17 at 20:44

Building on @Basile's answer, you probably ran out of pids.

cat /proc/sys/kernel/pid_max gives me 32768 on my machine (the kernel's default, 2^15), which is less than 50k.

EDIT: I missed that /proc/sys/kernel/pid_max is set to 200000. That probably isn't the issue in this case.

Thayne