
Having some major issues with our Tomcat 7 installation. It ran fine for over a year, but now the CPU is mostly at 400% - 700% (8-core machine) and in the logs I see tons of these:

2015-04-01T23:05:13.046+0200: 5177.229: [GC2015-04-01T23:05:13.046+0200:
5177.229: [ParNew: 1398632K->345K(1747648K), 0.0062240 secs]
3439622K->2041335K(5941952K) icms_dc=0 , 0.0063310 secs] [Times: user=0.04
sys=0.00, real=0.00 secs]

My settings for Tomcat startup (in setenv.sh) are:

CATALINA_OPTS="$CATALINA_OPTS -server -Xms6G -Xmx6G -XX:+UseConcMarkSweepGC
-XX:+CMSIncrementalMode -XX:NewRatio=2 -XX:SurvivorRatio=4
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
-Djava.awt.headless=true
-Dorg.apache.jasper.runtime.BodyContentImpl.LIMIT_BUFFER=true
-Dmail.mime.decodeparameters=true -Djava.security.egd=file:/dev/./urandom"

The usual behavior is that the machine runs for an hour or so, I watch it consuming memory, and then it starts trying to GC and I get those log entries every second.

I've tried different GC settings and have had no success. As this only started recently, I wonder what the cause could be, or whether anyone has a solution?

The server has 32 GB RAM, running Ubuntu 14.04.2 LTS and Oracle Java 7.

Nitai
  • You are seeing those entries in your log because of `-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps`; what you need to find out is why the JVM needs to garbage collect all the time. I'd say you need to attach a profiler to your JVM and take a look at the heap, thread dumps, etc. (a command sketch follows these comments). It's completely impossible to know why that happens without knowing what is deployed in your Tomcat and how it behaves under load. You might just have a huge memory leak somewhere. – dawud Apr 01 '15 at 21:36
  • `I've tried different GC settings and have had no success` OK, what did you try, why, and what did you expect from each attempt? – 030 Apr 03 '15 at 21:09
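Following up on the profiling suggestion above, a minimal sketch of capturing a thread dump and a heap dump with the stock JDK 7 tools (the PID 12345 and the output paths are placeholders; both commands need to run as the same user as the Tomcat process):

    # Find the Tomcat process id
    jps -l

    # Thread dump (replace 12345 with the Tomcat PID)
    jstack 12345 > /tmp/tomcat-threads.txt

    # Heap dump of live objects; open the .hprof in a profiler such as VisualVM or Eclipse MAT
    jmap -dump:live,format=b,file=/tmp/tomcat-heap.hprof 12345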

1 Answer


You have GC logging turned on, and it appears GC is working well.

 [ParNew: 1398632K->345K(1747648K), 0.0062240 secs]

Indicates that about 1.3 GB of new objects were cleaned up in less than 1/100th of a second. If this is happening once every second, the application is creating a lot of short-lived objects. Check the time stamps and see how often it occurs; I generally aim for several seconds (10 or more) between new-generation GC cycles. You didn't provide enough data to do much of an analysis. It is common for concurrent GC cycles to occur every hour or so; these produce a different set of log messages.
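One quick way to see that frequency on the live process, rather than paging through the log, is jstat from the JDK (sketch only; 12345 stands in for the Tomcat PID):

    # Print heap occupancy and GC counters every 1000 ms.
    # YGC/YGCT = young-generation collections and time, FGC/FGCT = full/CMS collections.
    jstat -gcutil 12345 1000

    # Or pull the ParNew entries straight from the GC log and compare the timestamps
    # (the log path is an assumption; see the logging options below):
    grep "ParNew" logs/gc.log | tail -20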

It is often helpful to write GC data to a separate log file, which makes it easier to review, and you can have that log rotated after it reaches a certain size. Try these options: -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=3 -XX:GCLogFileSize=100M, and if you don't already log GC to a separate file, add -Xloggc:logs/gc.log.
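As a sketch, those options could be appended to the existing line in setenv.sh like this (the gc.log path is an assumption; a relative path is resolved from the directory Tomcat is started from, so an absolute path under $CATALINA_BASE is safer):

    # GC log in Tomcat's logs directory, rotated across 3 files of up to 100 MB each
    CATALINA_OPTS="$CATALINA_OPTS \
      -Xloggc:$CATALINA_BASE/logs/gc.log \
      -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=3 -XX:GCLogFileSize=100M"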

With that large a new generation, concurrent GCs may take a long time to run. Adding -XX:+ScavengeBeforeFullGC causes the new generation to be cleaned at the start of a full collection, which should make GC behavior easier to tune.
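Once the separate GC log is in place, a rough check on how long the stop-the-world CMS phases take is to pull them out of the log (the phase names below assume -XX:+PrintGCDetails output from the CMS collector, and the log path matches the sketch above):

    # Initial mark and remark are the CMS pauses that stop the application;
    # long pauses or frequent Full GC entries point at tuning or leak problems.
    grep -E "CMS-initial-mark|CMS-remark|Full GC" $CATALINA_BASE/logs/gc.log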

Review the documentation on garbage collection to better understand what is happening.

BillThor
  • Thank you, this is very helpful. I'm aware that there is more to it than simply tuning a parameter here and there. It is just that we use memcached, web apps, Lucene, etc. all in one app. Since this is a live production server, I can't just turn certain apps off. So, I will look into your suggestions. – Nitai Apr 02 '15 at 23:12