3

Application becomes unresponsive on high load while G1 garbage collector is in the Remark stage(Finalize marking). This is part of the gc log (pause happened at 2015-08-30T11:24:23.780):

2015-08-30T11:24:22.736-0400: 431394.465: [GC pause (GCLocker Initiated GC) (young) (initial-mark), 0.2050778 secs]
[Parallel Time: 156.7 ms, GC Workers: 23]
[GC Worker Start (ms): Min: 431394466.6, Avg: 431394466.8, Max: 431394466.9, Diff: 0.3]
  [Ext Root Scanning (ms): Min: 3.9, Avg: 4.3, Max: 8.5, Diff: 4.5, Sum: 99.2]
  [Update RS (ms): Min: 15.7, Avg: 20.0, Max: 20.5, Diff: 4.8, Sum: 460.5]
     [Processed Buffers: Min: 21, Avg: 31.3, Max: 54, Diff: 33, Sum: 719]
  [Scan RS (ms): Min: 56.4, Avg: 57.2, Max: 58.2, Diff: 1.8, Sum: 1314.9]
  [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
  [Object Copy (ms): Min: 72.9, Avg: 73.8, Max: 74.5, Diff: 1.6, Sum: 1698.3]
  [Termination (ms): Min: 0.4, Avg: 0.8, Max: 0.9, Diff: 0.5, Sum: 17.8]
     [Termination Attempts: Min: 1, Avg: 186.9, Max: 214, Diff: 213, Sum: 4298]
  [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.3]
  [GC Worker Total (ms): Min: 156.0, Avg: 156.2, Max: 156.3, Diff: 0.3, Sum: 3592.2]
  [GC Worker End (ms): Min: 431394622.9, Avg: 431394622.9, Max: 431394623.0, Diff: 0.1]
[Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 6.4 ms]
[Other: 41.8 ms]
  [Choose CSet: 0.0 ms]
  [Ref Proc: 27.2 ms]
  [Ref Enq: 1.2 ms]
  [Redirty Cards: 2.3 ms]
  [Humongous Register: 0.1 ms]
  [Humongous Reclaim: 0.1 ms]
  [Free CSet: 8.5 ms]
[Eden: 11.4G(11.4G)->0.0B(10.4G) Survivors: 736.0M->800.0M Heap: 56.6G(100.0G)->45.3G(100.0G)]
[Times: user=3.69 sys=0.08, real=0.21 secs] 
2015-08-30T11:24:22.951-0400: 431394.679: [GC concurrent-root-region-scan-start]
2015-08-30T11:24:23.101-0400: 431394.829: [GC concurrent-root-region-scan-end, 0.1499777 secs]
2015-08-30T11:24:23.101-0400: 431394.829: [GC concurrent-mark-start]
2015-08-30T11:24:23.779-0400: 431395.508: [GC concurrent-mark-end, 0.6781424 secs]
2015-08-30T11:24:23.780-0400: 431395.509: [GC remark 2015-08-30T11:24:23.780-0400: 431395.509: [Finalize Marking, 128.7675820 secs] 2015-08-30T11:26:32.548-0400: 431524.277: [GC ref-proc, 0.0246842 secs] 2015-08-30T11:26:32.573-0400: 431524.301: [Unloading, 0.0113214 secs], 128.8485361 secs]
[Times: user=2182.97 sys=753.23, real=128.83 secs] 
2015-08-30T11:26:32.631-0400: 431524.360: [GC cleanup 45G->45G(100G), 0.1537170 secs]
[Times: user=3.46 sys=0.00, real=0.16 secs] 
2015-08-30T11:26:32.785-0400: 431524.514: [GC concurrent-cleanup-start]
2015-08-30T11:26:32.785-0400: 431524.514: [GC concurrent-cleanup-end, 0.0000678 secs] 

Server has 128GB of RAM

GC-related JVM options:

-XX:InitialHeapSize=107374182400 -XX:+ManagementServer -XX:MaxHeapSize=107374182400 -XX:-OmitStackTraceInFastThrow -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails
-XX:+PrintGCTimeStamps -XX:+TieredCompilation -XX:+UseCondCardMark -XX:+UseG1GC

Java version:

Java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

Any ideas what can be the reason?

UPDATE: I tried to set ParallelGCThreads=32 and ConcGCThreads=8. Pause duration was 82 seconds. It became shorter but still too long

 {Heap before GC invocations=1558 (full 0):
     garbage-first heap   total 94371840K, used 83001550K [0x00007f76d0000000, 0x00007f76d2005a00, 0x00007f8d50000000)
      region size 32768K, 1234 young (40435712K), 58 survivors (1900544K)
     Metaspace       used 77160K, capacity 77951K, committed 78080K, reserved 79872K
     204611.028: [G1Ergonomics (Concurrent Cycles) initiate concurrent cycle, reason: concurrent cycle initiation requested]
    2015-10-04T15:00:05.197-0400: 204611.028: [GC pause (GCLocker Initiated GC) (young) (initial-mark)
    Desired survivor size 2600468480 bytes, new threshold 2 (max 2)
    - age   1:  621290376 bytes,  621290376 total
    - age   2:  245159208 bytes,  866449584 total
     204611.029: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 184157, predicted base time: 57.32 ms, remaining time: 142.68 ms, target pause time: 200.00 ms]
     204611.029: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 1176 regions, survivors: 58 regions, predicted young region time: 131.19 ms]
     204611.029: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 1176 regions, survivors: 58 regions, old: 0 regions, predicted pause time: 188.51 ms, target pause time: 200.00 ms]
    2015-10-04T15:00:05.416-0400: 204611.247: [SoftReference, 0 refs, 0.0023502 secs]2015-10-04T15:00:05.418-0400: 204611.249: [WeakReference, 394 refs, 0.0018755 secs]2015-10-04T15:00:05.420-0400: 204611.251: [FinalReference, 91704 refs, 0.
    0081225 secs]2015-10-04T15:00:05.428-0400: 204611.259: [PhantomReference, 29 refs, 1 refs, 0.0048953 secs]2015-10-04T15:00:05.433-0400: 204611.264: [JNI Weak Reference, 0.0000229 secs], 0.2600007 secs]
       [Parallel Time: 215.1 ms, GC Workers: 32]
          [GC Worker Start (ms): Min: 204611030.3, Avg: 204611030.5, Max: 204611030.6, Diff: 0.4]
          [Ext Root Scanning (ms): Min: 2.5, Avg: 3.2, Max: 8.3, Diff: 5.8, Sum: 103.2]
          [Update RS (ms): Min: 19.0, Avg: 24.5, Max: 35.3, Diff: 16.3, Sum: 782.9]
             [Processed Buffers: Min: 7, Avg: 23.7, Max: 31, Diff: 24, Sum: 758]
          [Scan RS (ms): Min: 40.4, Avg: 42.2, Max: 43.0, Diff: 2.6, Sum: 1349.0]
          [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
          [Object Copy (ms): Min: 131.5, Avg: 140.0, Max: 142.4, Diff: 10.9, Sum: 4480.1]
          [Termination (ms): Min: 1.9, Avg: 4.1, Max: 4.6, Diff: 2.7, Sum: 130.8]
             [Termination Attempts: Min: 262, Avg: 544.9, Max: 599, Diff: 337, Sum: 17437]
          [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: 7.1]
          [GC Worker Total (ms): Min: 213.8, Avg: 214.2, Max: 214.5, Diff: 0.7, Sum: 6853.3]
          [GC Worker End (ms): Min: 204611244.4, Avg: 204611244.6, Max: 204611244.8, Diff: 0.4]
       [Code Root Fixup: 0.0 ms]
       [Code Root Purge: 0.0 ms]
       [Clear CT: 8.4 ms]
       [Other: 36.4 ms]
          [Choose CSet: 0.1 ms]
          [Ref Proc: 19.1 ms]
          [Ref Enq: 0.8 ms]
          [Redirty Cards: 2.0 ms]
          [Humongous Register: 0.0 ms]
          [Humongous Reclaim: 0.0 ms]
          [Free CSet: 11.5 ms]
       [Eden: 36.8G(36.7G)->0.0B(35.8G) Survivors: 1856.0M->1568.0M Heap: 79.2G(90.0G)->42.4G(90.0G)]
    Heap after GC invocations=1559 (full 0):
     garbage-first heap   total 94371840K, used 44456740K [0x00007f76d0000000, 0x00007f76d2005a00, 0x00007f8d50000000)
      region size 32768K, 49 young (1605632K), 49 survivors (1605632K)
     Metaspace       used 77160K, capacity 77951K, committed 78080K, reserved 79872K
    }
     [Times: user=7.11 sys=0.18, real=0.26 secs]
    2015-10-04T15:00:05.457-0400: 204611.288: [GC concurrent-root-region-scan-start]

    2015-10-04T15:00:05.457-0400: 204611.288: Total time for which application threads were stopped: 0.2607079 seconds, Stopping threads took: 0.0000623 seconds

    2015-10-04T15:00:05.593-0400: 204611.424: [GC concurrent-root-region-scan-end, 0.1359037 secs]

    2015-10-04T15:00:05.593-0400: 204611.424: [GC concurrent-mark-start]

    2015-10-04T15:00:06.238-0400: 204612.069: [GC concurrent-mark-end, 0.6448793 secs]

    2015-10-04T15:00:06.238-0400: 204612.069: [GC remark 2015-10-04T15:00:06.238-0400: 204612.069: [Finalize Marking, 82.5606577 secs] 2015-10-04T15:01:28.799-0400: 204694.630: [GC ref-proc2015-10-04T15:01:28.799-0400: 204694.630: [SoftRefer
    ence, 64 refs, 0.0031041 secs]2015-10-04T15:01:28.802-0400: 204694.633: [WeakReference, 383 refs, 0.0023197 secs]2015-10-04T15:01:28.805-0400: 204694.636: [FinalReference, 100288 refs, 0.0330312 secs]2015-10-04T15:01:28.838-0400: 204694.
    669: [PhantomReference, 2 refs, 37 refs, 0.0030514 secs]2015-10-04T15:01:28.841-0400: 204694.672: [JNI Weak Reference, 0.0000915 secs], 0.0539037 secs] 2015-10-04T15:01:28.853-0400: 204694.684: [Unloading, 0.0130054 secs], 82.6649201 sec
    s]
     [Times: user=1139.19 sys=393.93, real=82.65 secs]

    2015-10-04T15:01:28.903-0400: 204694.735: Total time for which application threads were stopped: 82.6655358 seconds, Stopping threads took: 0.0000918 seconds

    2015-10-04T15:01:28.904-0400: 204694.735: [GC cleanup 42G->42G(90G), 0.1166196 secs]
  • remark -> SATB processing -> SATBs are written by mutator threads. So maybe the concurrent phase can't keep up with mutators. How many cores/threads? How many dedicated to GC (check with `-XX:+PrintFlagsFinal` and look for `ConcGCThreads`. Also, are your allocation rates so high that a 10G young gen is necessary? – the8472 Sep 11 '15 at 15:04
  • Sorry for late response. There are 32 cores(with HyperThreading). -XX:+PrintFlagsFinal shows 6 for ConcGCThreads – Ivan Shramko Oct 01 '15 at 16:34

1 Answers1

1

AIUI the Finalize Marking part of G1's mostly consists of processing SATB buffers that couldn't be handled during the concurrent marking phase.

So my conclusion is that concurrent-mark does not process SATB buffers fast enough to keep up with the mutator threads generating them.

So what you can try is increasing the ConcGCThreads count. Note that those threads will compute for CPU time with your application threads.

the8472
  • 40,999
  • 5
  • 70
  • 122
  • Thanks for the response. Do you have any idea how many threads should I give for GC? – Ivan Shramko Oct 02 '15 at 08:58
  • no, just try some numbers and measure and see how much of a difference it makes. – the8472 Oct 02 '15 at 17:24
  • I tried to increase threads and it didn't help. I updated the original post – Ivan Shramko Oct 06 '15 at 13:34
  • another sanity check is to verify that no swapping happens, although it would be strange if it only showed up in remark pauses and nowhere else. – the8472 Oct 06 '15 at 21:28
  • It didn't help. Pauses probably became shorter because load on server decreased. There were two pauses again with duration about 80 seconds – Ivan Shramko Oct 12 '15 at 09:18
  • Hrrm, i'm seeing `[FinalReference, 100288 refs, 0.0330312 secs]`, that's a lot of references here even though the processing itself seems fast. See if `+ParallelRefProcEnabled` makes any difference. And maybe track down what's using those finalizers. – the8472 Oct 12 '15 at 09:41
  • I wrote to the maillist you told me about - hotspot-gc-use. I've got response there that transparent_huge_pages on linux can be the problem. I have disabled it and will see if it helps. – Ivan Shramko Oct 13 '15 at 10:46
  • ah, interesting. if that doesn't help either i'm out of ideas. – the8472 Oct 13 '15 at 10:51