I am looking for the appropriate settings to configure the JVM for a web application. I have read about the old/young/perm generations, but I have trouble working out how best to set those parameters for this configuration.

Out of the 4 GB, around 3 GB are used for a cache (applicative cache using EhCache), so I'm looking for the best set up considering that. FYI, the cache is static during the lifetime of the application (loaded from disk, never expires), but heavily used.

I have already profiled my application and optimized the DB queries, the application's architecture, the cache size, etc. I am just looking for JVM configuration advice here. I have measured 99% throughput for the garbage collector, and 6-8 s pauses when the Full GC runs (approximately once every half hour).

Here are the current JVM parameters:

-XX:+UseParallelGC -XX:+AggressiveHeap -Xms2048m -Xmx4096m
-XX:NewSize=64m -XX:PermSize=64m -XX:MaxPermSize=512m
-verbose:gc -XX:+PrintGCDetails -Xloggc:gc.log

Those parameters may be completely off because they were written a long time ago, before the application became this big.

I am using 64-bit Java 1.5.

Do you see any possible improvements?

Edit: the machine has 4 cores.

Matthieu Napoli

3 Answers

-XX:+UseParallelOldGC should speed up the Full GCs on a multi-core machine.

You could also profile with different NewRatio values. Your cached objects will live in the tenured generation so profile it with -XX:NewRatio=7 and then again with some higher and lower values.
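To get a feel for what a given NewRatio means in megabytes, here is a rough sketch of the arithmetic. It assumes the heap is fixed at the 4096m from the question and ignores any alignment or rounding the JVM applies, so the real sizes may differ slightly. The list of ratios to try is just my pick for illustration.

```java
// Sketch only: approximate young/old split for a fixed heap and a given -XX:NewRatio.
// NewRatio is the ratio old:young, so young = heap / (NewRatio + 1).
class NewRatioSplit {

    /** Approximate young generation size in MB (JVM alignment is ignored). */
    static long youngMb(long heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    /** Approximate old (tenured) generation size in MB. */
    static long oldMb(long heapMb, int newRatio) {
        return heapMb - youngMb(heapMb, newRatio);
    }

    public static void main(String[] args) {
        long heapMb = 4096;            // -Xmx4096m from the question
        int[] ratios = {3, 7, 11, 15}; // illustrative values to profile with
        for (int r : ratios) {
            System.out.println("NewRatio=" + r
                    + " young~" + youngMb(heapMb, r) + "m"
                    + " old~" + oldMb(heapMb, r) + "m");
        }
    }
}
```

With NewRatio=7 this works out to roughly 512m young and 3584m old, which leaves some headroom above a 3 GB cache in the tenured generation.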

You may not be able to accurately replicate realistic use during profiling, so make sure you monitor GC when it is in real life use and then you can make minor changes (e.g. to survivor space etc) and see what effect they have.
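Besides reading gc.log, one way to watch GC in a running application is the java.lang.management API, which has been in the JDK since Java 5. The sketch below is only an illustration: the class name GcStats is made up, and the collector names it prints depend on which collector flags the JVM was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Sketch only: dump cumulative collection counts and times for each collector.
class GcStats {

    /** One line per collector: name, number of collections, total time in ms. */
    static String snapshot() {
        StringBuilder sb = new StringBuilder();
        List<GarbageCollectorMXBean> beans = ManagementFactory.getGarbageCollectorMXBeans();
        for (GarbageCollectorMXBean gc : beans) {
            sb.append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(" timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In a real application you would log this periodically instead of once.
        System.out.print(snapshot());
    }
}
```

Logging a snapshot before and after a change to the flags gives a quick comparison of how much time each generation's collector is taking.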

Old advice was not to use AggressiveHeap with Xms and Xmx, I am not sure if that is still true.

Edit: Please let us know which OS/hardware platform you are deployed on.

Full collections every 30 minutes indicate the old generation is quite full. A high value for NewRatio will give it more space at the expense of the young gen. Can you give the JVM more than 4g or are you limited to that?

It would also be useful to know what your goals / non functional requirements are. Do you want to avoid these 6 / 7 second pauses at the risk of lower throughput or are those pauses an acceptable compromise for highest possible throughput?

If you want to minimise the pauses, try the CMS collector by removing both

-XX:+UseParallelGC -XX:+UseParallelOldGC 

and adding

-XX:+UseConcMarkSweepGC -XX:+UseParNewGC

Profile with that with various NewRatio values and see how you get on.

One downside of the CMS collector is that unlike the parallel old and serial collectors, it doesn't compact the old generation. If the old generation gets too fragmented and a minor collection needs to promote a lot of objects to the old gen at once, a full serial collection may be invoked which could mean a long pause. (I've seen this once in prod but with the IBM JVM which went out of memory instead of invoking a compacting collection!)

This might not be a problem for you - it depends on the nature of the application - but you can insure against it by restarting nightly or weekly.

Paul Medcraft
  • Just in case it's not clear, UseParallelOldGC is different from UseParallelGC. If you use UseParallelOldGC then UseParallelGC is also turned on, so you don't need both. – Paul Medcraft Jan 10 '12 at 13:59
  • I will try UseParallelOldGC and NewRatio ASAP, thanks. If anyone knows about AggressiveHeap with Xms and Xmx let me know. – Matthieu Napoli Jan 10 '12 at 14:04
  • UseParallelOldGC wasn't effective, I got 40s Full GC instead of 7s :D. Very strange (I removed UseParallelGC as you said) – Matthieu Napoli Jan 10 '12 at 15:48
  • Try it without aggressiveheap and without NewSize. If you know you are going to use 3GB for the cache, also set Xms to 4096m. If you don't do that, it will Full GC every time it needs to increase the heap. – Paul Medcraft Jan 10 '12 at 16:32
  • Yes I've already removed AggressiveHeap and set Xms to 4096m. I've seen the gc.log with GCViewer and the startup is better (no full gc) – Matthieu Napoli Jan 10 '12 at 16:39
  • Are there any other heavy processes running on the machine? If so you might need to tell it how many cores are available for Java: -XX:ParallelGCThreads=2. By default it will try to use them all on a 4-core machine. What OS and hardware is this running on? – Paul Medcraft Jan 10 '12 at 19:28
  • No other heavy processes. There are 4 cores. I tried the `NewRatio` and it's working amazingly well! I'm trying all those parameters but the NewRatio is very effective. – Matthieu Napoli Jan 11 '12 at 13:41

I would use 64-bit Java 6 update 30 or Java 7 update 2, as they are much more efficient. For example, they use 32-bit (compressed) references by default.

I would also configure Ehcache to use direct memory or a memory mapped file if possible. This should minimise the impact on GC.

Using these options it's possible to almost eliminate your heap footprint. For example, I have an app which uses up to 180 GB of memory-mapped files on a machine with 16 GB of memory, and the heap size is 6 MB. A full GC takes up to 11 ms when triggered manually, not that it ever GCs. ;)

If you want a simple example, here is one where I map an 8 TB file into memory and update it: http://vanillajava.blogspot.com/2011/12/using-memory-mapped-file-for-huge.html
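For a much smaller taste of the same idea, here is a minimal sketch (not the code from the linked post) that maps 8 bytes of a temporary file, writes a long and reads it back. The class name MappedCounter and the temp-file naming are made up for illustration. The mapped data lives outside the Java heap, which is why it adds almost nothing to GC work.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Sketch only: store a value in a memory-mapped file instead of on the heap.
class MappedCounter {

    /** Maps the first 8 bytes of the file, writes the value and reads it back. */
    static long writeAndRead(File file, long value) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        try {
            FileChannel channel = raf.getChannel();
            // A READ_WRITE mapping grows the file to the mapped size if needed.
            MappedByteBuffer buf = channel.map(FileChannel.MapMode.READ_WRITE, 0, 8);
            buf.putLong(0, value);
            return buf.getLong(0);
        } finally {
            raf.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("mapped", ".dat");
        f.deleteOnExit();
        System.out.println(writeAndRead(f, 42L));
    }
}
```

A real cache would map a much larger region and define its own record layout on top of the buffer. This only shows the mechanics.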

Peter Lawrey
  • Switching to Java 6 is a good idea, however it's a major change and it's not my decision unfortunately :( (company policy) – Matthieu Napoli Jan 10 '12 at 13:57
  • Regarding configuring EhCache, this has been tested but because of the application's architecture, the cache has to be in memory (not on disk) else the access times are too long (too many accesses). Optimization regarding that is in progress. I can't unfortunately use EhCache BigMemory (off-heap memory) because it's not free (I'm not the one making the choice on this). To summarize: I'd like to tune my JVM configuration considering the cache is in the JVM (no change to the application's architecture for now). – Matthieu Napoli Jan 10 '12 at 13:59
  • Accessing a record in a memory-mapped file can take 50 - 200 nanoseconds (if it's in the OS disk cache), which is not as fast as accessing fields in an object but pretty quick. I don't use Ehcache, I just use memory-mapped files. – Peter Lawrey Jan 10 '12 at 14:13
  • Yes, but we use a network filesystem, so the performance is not the same. We are considering Memcached for example, but it's work in progress so I am trying to optimize the current configuration. – Matthieu Napoli Jan 10 '12 at 14:28
  • It is true that network disks are unlikely to perform as well as local disks. You could use direct memory instead, assuming you boot off local disks. – Peter Lawrey Jan 10 '12 at 14:50

I hope you just removed -server to not inflate the post, otherwise you should enable it right away. Apart from the slightly longer startup time (which really isn't an issue for a web application that should run for days) I don't see any reason to use anything but c2. That could give some nice performance improvements in general. Umm, back to topic:

Sadly the best thing I can think of won't work with your ancient JVM. The G1 garbage collector was basically designed to reduce latency. Not only does it try to reduce pauses in general, it also offers some tuning parameters to set pause goals and intervals. See this page.

There is an experimental backport to Java 6, though I doubt it's kept up to date. And I fear nobody is spending any time on optimizing GC or anything else for Java 1.5 anymore.

PS: There would also be IBM's JVM and obviously Azul Systems (ok, that wasn't a serious proposition ;) ), but those are obviously out of the question. Just wanted to mention them.

Voo
  • Unless it's running on Windows, won't the JVM default to server mode on a 4-core machine with >2GB RAM? – Paul Medcraft Jan 10 '12 at 20:44
  • @Paul No idea, I'm not aware of any such optimizations, but I don't know everything about Hotspot. Still I wouldn't risk such an important flag on a conditional default value. – Voo Jan 10 '12 at 22:33
  • @Paul Interesting tidbit, thanks. So c1 is basically completely unused these days (since HotSpot defaults to c2 for every 64-bit CPU) - good change really. – Voo Jan 11 '12 at 10:22
  • @PaulMedcraft for info, yes, the JVM auto-detects the "server class" of the current machine and changes the JVM options accordingly. This is called "ergonomics settings": http://java.sun.com/performance/reference/whitepapers/tuning.html#section4.1.1 So no need to put -server – Offirmo Jul 09 '12 at 09:49
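If you would rather check than rely on the default, a small sketch like the one below prints which VM the process is actually running and how many cores it sees. The class name VmInfo is made up. On HotSpot the java.vm.name property normally contains "Server VM" or "Client VM", but treat the exact wording as JVM-specific rather than guaranteed.

```java
// Sketch only: report the running VM flavour and the core count the JVM detects.
class VmInfo {

    /** VM name as reported by the runtime, e.g. a string containing "Server VM". */
    static String vmName() {
        return System.getProperty("java.vm.name");
    }

    /** Number of processors available to the JVM. */
    static int cores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("VM: " + vmName());
        System.out.println("Cores: " + cores());
    }
}
```

Running this once on the production box settles the -server question, and the core count is a useful sanity check before setting -XX:ParallelGCThreads.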