On a Linux server running 6 Java processes (tc-server instances with different web applications), several servers sometimes stop working at almost the same time because of this error:

Exception in thread "ajp-bio-9096-Acceptor-0" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1371)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:162)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:142)
at org.apache.catalina.core.StandardThreadExecutor.execute(StandardThreadExecutor.java:169)
at org.apache.tomcat.util.net.JIoEndpoint.processSocket(JIoEndpoint.java:531)
at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:234)
at java.lang.Thread.run(Thread.java:745)

This causes one or more of the JVMs running on the machine to stop working (we have 6 tc-servers running on the same machine). It looks like the maximum number of processes on our Linux machine is reached. The maximum number of threads at OS level is about 31389 (ulimit -u), but we do not see an excessive number of threads in Linux. When I look at the operating system, there are about 1500 threads running:

(ps -eLF | wc -l)

When profiling/monitoring the tc-server processes, the number of threads per Java process is normally between 50 and 150. Under load that can go up to 350 because of HTTP threads, but it comes back down and never reaches the maximum of 1000 threads per process (on this machine).

We are using a 64-bit Java 1.7 runtime environment, and there is always free memory on the OS.

Today we even got this message 1 minute after server start. The JVM stopped working, but the OS process kept running. When I looked at the number of threads of this process, it was 51.

(ps uH p <pid> | wc -l).

So it looks like a maximum number of threads is reached, but we never see a lot of running threads.

Edwin
  • Mind grabbing a capture of `top` so we can see how much memory is being used? – Johnny V Aug 14 '16 at 17:20
  • That shows: Tasks: 364 total, 1 running, 363 sleeping, 0 stopped, 0 zombie Cpu(s): 9.7%us, 5.9%sy, 0.0%ni, 83.4%id, 0.2%wa, 0.0%hi, 0.8%si, 0.0%st Mem: 8057304k total, 7778672k used, 278632k free, 476752k buffers Swap: 1048572k total, 0k used, 1048572k free, 1301448k cached – Edwin Aug 14 '16 at 17:49
  • When looking at a graph of memory usage over time, we always see a small amount of free memory in both real memory and swap. – Edwin Aug 14 '16 at 17:50
  • Yeah, having 278MB free RAM is a problem. I would have liked to see the entire `top` output, including how much memory your Java processes were using, but I'm guessing they are using everything. – Johnny V Aug 14 '16 at 18:09
  • Remember, a thread doesn't need to be running to count towards the limit. Also, I believe that processes themselves count towards the thread limit. – Johnny V Aug 14 '16 at 18:10
  • Also, this isn't on AWS or a cloud service? – Johnny V Aug 15 '16 at 01:04
  • https://forums.aws.amazon.com/thread.jspa?threadID=86751 – Johnny V Aug 15 '16 at 02:07
  • It's a dedicated virtual machine, but it turned out that I was misled about the ulimit -u value. It turned out to be too low for the user running the Java runtime environments. – Edwin Aug 15 '16 at 20:56

1 Answer

I found the cause of the problem.

I checked ulimit -u with my own user. That returns 31389, so I could not see any reason why this limit would be reached.

But in production these processes run under another user, and for that user ulimit -u returns 1024.
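
To avoid being misled by the limit of your own login shell, you can read the limit that applies to the process that is already running. On Linux the kernel exposes this in /proc/<pid>/limits. A minimal sketch, assuming a Linux /proc filesystem; the function name `max_procs_for_pid` is made up for illustration:

```shell
# Print the soft "Max processes" limit (the value ulimit -u reports) that
# applies to an already-running process, read from /proc/<pid>/limits.
# max_procs_for_pid is a hypothetical helper name, not a standard tool.
max_procs_for_pid() {
    # Line format: "Max processes   <soft>   <hard>   processes"
    awk '/^Max processes/ { print $3 }' "/proc/$1/limits"
}

# Example: limit of the current shell; substitute the PID of a tc-server JVM.
max_procs_for_pid "$$"
```

Running this against the PID of one of the Java processes should show the limit of the production user rather than that of the user you are logged in as.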

These 6 servers each have 50-150 threads in their default state, and the limit counts all threads of that user across all processes together. So when there is temporarily a bit more load, the combined number of threads of the tc-servers reaches the limit of 1024.
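
This is why the per-process counts looked harmless: on Linux the `ulimit -u` limit is applied per user, and every thread counts against it. A rough sketch for totalling the threads of one user, assuming the procps version of `ps`; `threads_for_user` is a hypothetical helper name:

```shell
# Count every thread (LWP) owned by a user, across all of that user's processes.
# threads_for_user is a hypothetical helper name, not a standard tool.
threads_for_user() {
    # -L prints one line per thread, -U selects by real user ID or name,
    # and "-o lwp=" prints only the thread ID with no header line.
    ps -L -U "$1" -o lwp= | wc -l
}

# Example: threads of the current user; substitute the production account.
threads_for_user "$(id -u)"
```

With the numbers from the question, five servers at 150 threads plus one at 350 under load already gives 1100, which is above 1024.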

So in Linux we increased the maximum number of threads (ulimit -u) for the production user, and now it runs fine.
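
How the limit was raised is not shown above. One common way, on systems that apply limits through pam_limits, is an `nproc` entry in /etc/security/limits.conf. The fragment below is only a sketch: the account name `tcserver` and the value 4096 are placeholders, not values from this setup.

```
# /etc/security/limits.conf (or a file under /etc/security/limits.d/)
# <domain>  <type>  <item>  <value>
tcserver    soft    nproc   4096
tcserver    hard    nproc   4096
```

On some distributions a default soft nproc of 1024 is set by a file in /etc/security/limits.d/, which may take precedence over limits.conf, and services that are not started through a PAM session may not pick these settings up at all. It is worth re-checking the limit of the running process after a restart.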

Edwin