How to trigger collection of Old Generation in G1GC

Question

Recently, some of our servers have been crashing due to segfaults. Although I don't have a proven root cause, I do have a hunch that it relates to how our application is garbage collected, the GC tuning we've done, and the memory profile.

Investigating multiple occurrences of these crashes, there is a pattern I've identified from the point of view of the JVM:

prior to crash, number of threads increases to an above-normal level
prior to crash, the generally normal sawtooth pattern of overall heap usage goes away and the heap size grows without decreasing
prior to crash, the heap's young generation is consistently low, and does not appear to resize or grow in usage
prior to crash, the old generation grows to a size greater than any past old gen sizes, and does not appear to be cleaned up or collected
the segfault always has to do with an active GC thread, specifically copy_to_survivor_space

While I don't see hard evidence of an out-of-memory occurrence, I'm of the opinion that we are indeed running out of heap space for the application. If G1GC cannot copy young objects to survivor space during evacuation or promotion, it seems to logically follow that it did not have sufficient space to do so. Analyzing the GC logs, I don't see much of anything to do with Humongous objects, so I don't think they're taking up a bunch of space in the heap.

Looking at the memory profile, my hunch is that I should decrease InitiatingHeapOccupancyPercent from our current 70 to something closer to the default of 45 in order to trigger a collection cycle earlier. It seems to me, especially given the ever-growing size of the Old Gen, that a mixed/full GC needs to be triggered more often or at least earlier. How do I initiate a full/mixed collection?

Based on the information provided, are there other thoughts or opinions on how I can trigger collection sooner? Am I misinterpreting the segfault message and heading down the wrong path? What else can I do to gather information that might enable me to address the root cause of the crashes?

Detail
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f38aa2655f5, pid=6293, tid=0x00007f3894efe700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_162-b12) (build 1.8.0_162-b12)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.162-b12 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x5c85f5]  G1ParScanThreadState::copy_to_survivor_space(InCSetState, oopDesc*, markOopDesc*)+0x45
#

JVM Options:

-XX:MaxHeapSize=30g
-XX:MetaspaceSize=256m
-XX:MaxMetaspaceSize=512m
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
-XX:InitiatingHeapOccupancyPercent=70
-XX:-OmitStackTraceInFastThrow
-XX:+AlwaysPreTouch
-XX:+UseStringDeduplication
-XX:+UseCompressedOops

-Xloggc:/usr/local/company/logs/gc.log
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=100M
-XX:+PrintAdaptiveSizePolicy
-XX:+PrintGCApplicationConcurrentTime
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCCause
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC
-XX:+PrintReferenceGC
-XX:+PrintTenuringDistribution
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/usr/local/company/logs/heapdump_126960.hprof

score 1 · Answer 1 · answered Apr 19 '19 at 17:05

Am I misinterpreting the segfault message and heading down the wrong path?

Yes. Heap OOMs should never result in a segfault; they should only surface as an OutOfMemoryError through the exception/throwable mechanism. The crash signature points to either a JVM bug or heap corruption caused by external factors (native libraries loaded into the JVM process, memory corruption, incorrect use of Unsafe).
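To illustrate the point about the throwable mechanism, here is a minimal sketch. It does not reproduce the crash in question; it only shows that an allocation the JVM cannot satisfy comes back to Java code as a catchable OutOfMemoryError. The class and method names are made up for this example, and the oversized array request is an assumption chosen to stand in for a genuinely exhausted heap (on HotSpot a long[] of length Integer.MAX_VALUE is expected to be rejected regardless of heap size; other JVMs may behave differently).

```java
class AllocationFailureDemo {

    /**
     * Tries to allocate a long[] of the given length (length must be >= 1).
     * Returns the OutOfMemoryError if the allocation failed, or null if it succeeded.
     */
    static OutOfMemoryError tryAllocate(int length) {
        try {
            long[] block = new long[length];
            block[0] = 1L; // touch the array so the allocation is actually used
            return null;
        } catch (OutOfMemoryError e) {
            // The failure arrives as an ordinary Throwable; the process keeps running.
            return e;
        }
    }

    public static void main(String[] args) {
        OutOfMemoryError small = tryAllocate(1024);
        OutOfMemoryError huge = tryAllocate(Integer.MAX_VALUE);

        System.out.println("small request failed: " + (small != null));
        System.out.println("huge request failed:  " + (huge != null));
        if (huge != null) {
            System.out.println("reported as: " + huge);
        }
    }
}
```

If the JVM were merely out of heap, this is the kind of failure you would expect to see in your logs (and -XX:+HeapDumpOnOutOfMemoryError would fire), not a SIGSEGV inside a GC worker thread.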

Try upgrading your JVM and see whether the cause has already been fixed in a newer version. If that does not help, try removing parts of your application, dependencies, Java agents, etc., or running on different hardware.
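As for the literal question of forcing a collection and gathering more data before a crash, the following is a rough sketch rather than a recommendation. It uses the standard java.lang.management beans to read the live thread count and old generation usage, then calls System.gc(). With G1 and default flags that call normally requests a full stop-the-world collection; with -XX:+ExplicitGCInvokesConcurrent it starts a concurrent cycle instead, and with -XX:+DisableExplicitGC it does nothing. The class name and the pool-name matching ("Old Gen" / "Tenured") are assumptions for illustration; pool names depend on the collector in use, so the probe returns -1 when nothing matches.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

class OldGenProbe {

    /**
     * Returns the used bytes of the first heap pool that looks like an old
     * generation (e.g. "G1 Old Gen"), or -1 if no such pool is found.
     */
    static long oldGenUsedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            boolean looksLikeOldGen = name.contains("Old Gen") || name.contains("Tenured");
            if (pool.getType() == MemoryType.HEAP && looksLikeOldGen) {
                return pool.getUsage().getUsed();
            }
        }
        return -1L;
    }

    /** Returns the current number of live threads in this JVM. */
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        System.out.println("threads:        " + liveThreads());
        System.out.println("old gen before: " + oldGenUsedBytes());

        // Only a request; the effect depends on the collector and the flags above.
        System.gc();

        System.out.println("old gen after:  " + oldGenUsedBytes());
    }
}
```

Logging these two numbers periodically would let you check whether the thread growth and the old gen growth described in the question really precede each crash. The same explicit collection can be requested from outside the process with jcmd <pid> GC.run. None of this addresses the segfault itself, which still points at a JVM bug or native memory corruption.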
