
The following code is supposed to make 100,000 threads:

/* compile with:   gcc -lpthread -o thread-limit thread-limit.c */
/* originally from: http://www.volano.com/linuxnotes.html */

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <string.h>

#define MAX_THREADS 100000
int i;

void *run(void *arg) {
  sleep(60 * 60);
  return NULL;
}

int main(int argc, char *argv[]) {
  int rc = 0;
  pthread_t thread[MAX_THREADS];
  printf("Creating threads ...\n");
  for (i = 0; i < MAX_THREADS && rc == 0; i++) {
    rc = pthread_create(&(thread[i]), NULL, run, NULL);
    if (rc == 0) {
      pthread_detach(thread[i]);
      if ((i + 1) % 100 == 0)
    printf("%i threads so far ...\n", i + 1);
    }
    else {
      printf("Failed with return code %i creating thread %i (%s).\n",
             rc, i + 1, strerror(rc));

      // can we allocate memory?
      char *block = NULL;
      block = malloc(65545);
      if(block == NULL)
        printf("Malloc failed too :( \n");
      else
        printf("Malloc worked, hmmm\n");
    }
  }
  sleep(60 * 60); // ctrl+c to exit; makes it easier to see mem use
  exit(0);
}

This is running on a 64-bit machine with 32 GB of RAM and a stock Debian 5.0 install.

  • ulimit -s 512 to keep the stack size down
  • /proc/sys/kernel/pid_max set to 1,000,000 (by default, it caps out at 32k pids).
  • ulimit -u 1000000 to increase the maximum number of processes (I don't think this matters here)
  • /proc/sys/kernel/threads-max set to 1,000,000 (by default, it wasn't set at all)
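
(Aside: a minimal readback sketch, just to confirm those values actually took effect before running the test; it only opens the standard Linux /proc paths listed above.)

/* Hypothetical readback helper, not part of the test program: print the
 * current values of the tunables above via the standard Linux /proc paths. */
#include <stdio.h>

static void show(const char *path) {
  char buf[64];
  FILE *f = fopen(path, "r");
  if (f == NULL) {
    printf("%-34s (unreadable)\n", path);
    return;
  }
  if (fgets(buf, sizeof buf, f) != NULL)
    printf("%-34s %s", path, buf);   /* the value already ends in '\n' */
  fclose(f);
}

int main(void) {
  show("/proc/sys/kernel/pid_max");
  show("/proc/sys/kernel/threads-max");
  show("/proc/sys/vm/max_map_count");
  return 0;
}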

Running this spits out the following:

65500 threads so far ...
Failed with return code 12 creating thread 65529 (Cannot allocate memory).
Malloc worked, hmmm

I'm certainly not running out of RAM; I can even launch several more instances of this program at the same time, and they all start their 65k threads.

(Please refrain from suggesting I not try to launch 100,000+ threads. This is simple testing of something which should work. My current epoll-based server has roughly 200k+ connections at all times and various papers would suggest that threads just might be a better option. - Thanks :) )

rekamso
  • `ulimit -s 512` actually sets the stack size limit to 512 kilobytes, not 512 bytes. So with 100,000 threads that would be almost 50GB (however, this is likely not the problem, as the stacks are demand-allocated). – caf Aug 19 '10 at 12:02
  • Yes, I've tried setting it to simply ulimit -s 1 and the result of 65528 threads is the same. Same if I use ulimit -s 1024 for that matter. – rekamso Aug 19 '10 at 12:05
  • 1
    Can you confirm with strace (and patience) that the final pthread_create (clone(2)?) call actually fails with ENOMEM? What are the values of, and what happens if you increase `/proc/sys/` files: `vm/max_map_count`, `kernel/pid_max` and `kernel/threads-max`? – pilcrow Aug 19 '10 at 14:10
  • 3
    Your "various papers" link points to one paper, which basically says threads are great if you change the thread libraries out for a custom green threads implementation and maybe change the compiler, too. Your test code is using the stock compiler and OS threads, so I don't see why you use even that one paper to support your decision to try this. Additionally, that paper ignores the fact that threads are fundamentally nondeterministic. You should read this newer paper from someone else at Berkeley: http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf – Warren Young Aug 19 '10 at 14:30
  • Please put your solution in an answer, so we can vote it up :). – Andrew Aylett Aug 19 '10 at 15:34
  • This http://www.kegel.com/c10k.html suggests too many threads would slow things down because of the overhead of context switching. Just something to consider if you haven't read it. – Ioan Aug 20 '10 at 17:39
  • You might consider a hybrid approach: that many threads could cause a large context-switch overhead problem, as Ioan said, and with stacks as small as you're talking about it becomes VERY easy to overrun the top of the stack, which is a recipe for heisenbugs. Also, according to the manpage for `pthread_attr_setstack()`, the smallest stack size you can use is 16k. – Spudd86 Nov 22 '10 at 19:41
  • Threads are a better option, but a few dozen of them at most, not hundreds of thousands! You only need as many threads as things you need to do at once. – David Schwartz Aug 30 '11 at 13:14

4 Answers


pilcrow's mention of /proc/sys/vm/max_map_count is right on track; raising this value allows more threads to be created. I'm not sure of the exact formula involved, but a value above 1 million allows for some 300k+ threads.

(For anyone else experimenting with 100k+ threads, do look at pthread_create's mmap issues... making new threads gets really slow really fast when lower memory is used up.)
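
A rough way to see how close a process is to that limit (a sketch, assuming the usual Linux /proc layout; it doesn't pin down the per-thread formula, it just counts the current mappings, one line per mapping in /proc/self/maps, against the ceiling):

/* Hypothetical check (assumes Linux /proc): number of mappings this process
 * currently has versus the vm.max_map_count ceiling. */
#include <stdio.h>

int main(void) {
  FILE *f;
  long maps = 0, limit = 0;
  int c;

  f = fopen("/proc/self/maps", "r");      /* one line per mapping */
  if (f != NULL) {
    while ((c = fgetc(f)) != EOF)
      if (c == '\n')
        maps++;
    fclose(f);
  }

  f = fopen("/proc/sys/vm/max_map_count", "r");
  if (f != NULL) {
    if (fscanf(f, "%ld", &limit) != 1)
      limit = 0;
    fclose(f);
  }

  printf("mappings in use: %ld, max_map_count: %ld\n", maps, limit);
  return 0;
}

For the thread-limit test itself, the same count can be read externally from /proc/<pid>/maps while it runs.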

rekamso

One possible issue is the local variable thread in the main program. I think pthread_t would be 8 bytes on your 64-bit machine (assuming a 64-bit build), so that array takes 800,000 bytes on the stack. Your stack limit of 512K would then be a problem: 512K / 8 = 65536, which is suspiciously near the number of threads you are creating. You might try dynamically allocating that array instead of putting it on the stack.
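
A minimal sketch of that suggestion (only the declaration changes; the create/detach loop stays exactly as in the question):

/* Sketch: put the pthread_t handles on the heap instead of the 512K main stack. */
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

#define MAX_THREADS 100000

int main(void) {
  pthread_t *thread = malloc(MAX_THREADS * sizeof *thread);  /* ~800 KB, on the heap */
  if (thread == NULL) {
    perror("malloc");
    return 1;
  }
  /* ... same pthread_create / pthread_detach loop as in the question ... */
  free(thread);
  return 0;
}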

Mark Wilkins
  • Alternatively, leave the stack size alone for the initial thread and change it only for later ones (i.e. use `pthread_attr_setstack()` to set the stack size for each thread you create). – Spudd86 Nov 22 '10 at 19:32

This might help: it sets the stack size in the program to the smallest it can go (if that's not enough, pick your own value):

/* compile with:   gcc -lpthread -o thread-limit thread-limit.c */
/* originally from: http://www.volano.com/linuxnotes.html */

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <string.h>
#include <limits.h>   /* for PTHREAD_STACK_MIN */

#define MAX_THREADS 100000
int i;

void *run(void *arg) {
  sleep(60 * 60);
  return NULL;
}

int main(int argc, char *argv[]) {
  int rc = 0;
  pthread_t thread[MAX_THREADS];
  pthread_attr_t thread_attr;

  pthread_attr_init(&thread_attr);
  pthread_attr_setstacksize(&thread_attr, PTHREAD_STACK_MIN);

  printf("Creating threads ...\n");
  for (i = 0; i < MAX_THREADS && rc == 0; i++) {
    rc = pthread_create(&(thread[i]), &thread_attr, run, NULL);
    if (rc == 0) {
      pthread_detach(thread[i]);
      if ((i + 1) % 100 == 0)
    printf("%i threads so far ...\n", i + 1);
    }
    else {
      printf("Failed with return code %i creating thread %i (%s).\n",
             rc, i + 1, strerror(rc));

      // can we allocate memory?
      char *block = NULL;
      block = malloc(65545);
      if(block == NULL)
        printf("Malloc failed too :( \n");
      else
        printf("Malloc worked, hmmm\n");
    }
  }
  sleep(60 * 60); // ctrl+c to exit; makes it easier to see mem use
  exit(0);
}

Additionally, you could add a call like pthread_attr_setguardsize(&thread_attr, 0); just after the call to pthread_attr_setstacksize(), but then you'd lose stack overrun detection entirely, and it would only save you 4k of address space per thread and zero actual memory.
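
If you do go that route, here is a minimal sketch of the attribute setup with error checks; the helper name is made up, and it uses only the calls already shown above:

/* Sketch of the attribute setup discussed above (hypothetical helper):
 * minimal stack plus, optionally, no guard page (and so no overrun detection). */
#include <limits.h>    /* PTHREAD_STACK_MIN */
#include <pthread.h>

static int make_tiny_attr(pthread_attr_t *attr) {
  if (pthread_attr_init(attr) != 0)
    return -1;
  if (pthread_attr_setstacksize(attr, PTHREAD_STACK_MIN) != 0)
    return -1;
  if (pthread_attr_setguardsize(attr, 0) != 0)   /* optional, see above */
    return -1;
  return 0;
}

Pass the resulting attribute to pthread_create() exactly as in the listing above.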

Spudd86

Are you looking for a formula to calculate the maximum number of threads possible per process?

Linux enforces the maximum number of threads per process only indirectly:

number of threads = total virtual memory / (stack size * 1024 * 1024)

(with total virtual memory in bytes and the per-thread stack size in MB)

Thus, the number of threads per process can be increased by increasing the total virtual memory or by decreasing the stack size. But decreasing the stack size too much can lead to failures due to stack overflow, while the maximum virtual memory is equal to the swap memory.

Check your machine:

Total virtual memory: ulimit -v (the default is unlimited, so you need to increase swap to raise this)

Total stack size: ulimit -s (the default is 8 MB)

Commands to change these values:

ulimit -s newvalue

ulimit -v newvalue

(Replace newvalue with the limit you want to set.)
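
A small sketch of that arithmetic using getrlimit(), so you don't have to read the ulimit output by hand; note that RLIMIT_AS is often unlimited, in which case the formula gives no useful bound:

/* Sketch: compute the estimate above, total virtual memory / per-thread
 * stack size, from getrlimit() instead of ulimit output. */
#include <stdio.h>
#include <sys/resource.h>

int main(void) {
  struct rlimit as, stack;

  if (getrlimit(RLIMIT_AS, &as) != 0 || getrlimit(RLIMIT_STACK, &stack) != 0) {
    perror("getrlimit");
    return 1;
  }

  if (as.rlim_cur == RLIM_INFINITY)
    printf("virtual memory is unlimited; the formula gives no bound\n");
  else if (stack.rlim_cur == RLIM_INFINITY || stack.rlim_cur == 0)
    printf("stack size is unlimited; cannot estimate\n");
  else
    printf("estimated thread ceiling: %llu\n",
           (unsigned long long)(as.rlim_cur / stack.rlim_cur));
  return 0;
}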

References:

http://dustycodes.wordpress.com/2012/02/09/increasing-number-of-threads-per-process/

codersofthedark