I am running a build system. We used to use the CMS collector, but we started suffering from very long full GC cycles; throughput (time not spent doing GC) was around 90%. So I decided to switch to G1, on the assumption that even if the overall GC time is longer, the pauses will be shorter, ensuring higher availability. This seemed to work even better than I expected: I saw no full GC for almost 3 days, throughput was 97%, and overall GC performance was far better. (All screenshots and data are from GCViewer.)
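To be clear about what I mean by throughput: it is the share of elapsed time the JVM was not busy collecting. GCViewer derives the figure from the GC log; the snippet below is only a rough sketch of the same arithmetic done inside a JVM via the standard `java.lang.management` beans. The class and method names (`GcThroughput`, `throughputPercent`, `totalGcMillis`) are made up for illustration, and the collection times reported by the beans are approximate, so the result may not match GCViewer exactly.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcThroughput {

    // Throughput = percentage of elapsed time NOT spent in GC.
    public static double throughputPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 100.0; // nothing measured yet
        }
        return 100.0 * (uptimeMillis - gcMillis) / uptimeMillis;
    }

    // Accumulated collection time summed over all collectors of this JVM.
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC throughput: %.2f%%%n",
                throughputPercent(totalGcMillis(), uptime));
    }
}
```

With these definitions, 100 ms of GC in 1000 ms of uptime gives 90%, which is roughly where we were with CMS.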
That lasted until now (day 6). Today the system simply went berserk. Old space utilization is just barely under 100%, and I am seeing a Full GC triggered roughly every 2-3 minutes.
[Screenshot: old space utilization]
Heap size is 20G (128G RAM total). The flags I am currently using are:
-XX:+UseG1GC
-XX:MaxPermSize=512m
-XX:MaxGCPauseMillis=800
-XX:GCPauseIntervalMillis=8000
-XX:NewRatio=4
-XX:PermSize=256m
-XX:InitiatingHeapOccupancyPercent=35
-XX:+ParallelRefProcEnabled
plus logging flags. What I seem to be missing is -XX:ParallelGCThreads=20 (I have 32 processors); the default should be 8. I have also read in Oracle's documentation that -XX:G1NewSizePercent=4 is suggested for a 20G heap; the default should be 5.
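Since I am partly guessing at the defaults, one way to check what the JVM actually ended up with is to ask it. The sketch below assumes a HotSpot JVM (it relies on `com.sun.management.HotSpotDiagnosticMXBean`, which is HotSpot-specific); the class name `EffectiveGcFlags` and the helper `describe` are hypothetical. I only query flags from my list that I am fairly sure are ordinary product flags; anything the JVM does not recognise is reported as unknown rather than failing.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class EffectiveGcFlags {

    // Subset of the flags discussed above (assumed to exist on this JVM).
    static final String[] FLAGS = {
        "UseG1GC",
        "ParallelGCThreads",
        "MaxGCPauseMillis",
        "InitiatingHeapOccupancyPercent",
        "NewRatio"
    };

    // Returns "name=value (origin)", where origin says whether the value is a
    // default, was set on the command line, or was chosen ergonomically.
    public static String describe(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            VMOption opt = bean.getVMOption(name);
            return name + "=" + opt.getValue() + " (" + opt.getOrigin() + ")";
        } catch (IllegalArgumentException e) {
            // Thrown when the JVM has no option with this name.
            return name + "=<unknown to this JVM>";
        }
    }

    public static void main(String[] args) {
        for (String flag : FLAGS) {
            System.out.println(describe(flag));
        }
    }
}
```

Because the application is not mine, I would have to run this with the same JVM and the same flags as the product to get comparable values.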
I am using Java HotSpot(TM) 64-Bit Server VM 1.7.0_76, Oracle Corporation
What would you suggest? Have I made any obvious mistakes? What should I change? Am I too stingy in giving Java only 20G? My assumption is that giving it too much heap would mean longer GCs, as there is simply more to clean (peasant logic).
PS: The application is not mine. For me it is a boxed product.