
Cross-posting from the OpenJDK mailing list.

Why was a Full GC (Allocation Failure), or for that matter any GC at all, being triggered even though the Eden space was (almost) empty? Even if Metaspace were full, a GC caused by that should be reported as Full GC (Metadata GC Threshold), NOT Allocation Failure. Note too that in the log below the old generation (146546K used of 348160K) and the heap overall (146546K of 602112K) were nowhere near full.

This went on for roughly 9 hours. Any thoughts?

2017-10-28T20:30:32.207+0000: 382100.678: [Full GC (Allocation Failure) [PSYoungGen: 0K->0K(253952K)] [ParOldGen: 146546K->146546K(348160K)] 146546K->146546K(602112K), [Metaspace: 45647K->45647K(1091584K)], 0.1865769 secs] [Times: user=0.87 sys=0.00, real=0.19 secs]
2017-10-28T20:30:32.401+0000: 382100.872: [Full GC (Allocation Failure) [PSYoungGen: 160K->0K(253952K)] [ParOldGen: 146546K->146547K(348160K)] 146706K->146547K(602112K), [Metaspace: 45647K->45647K(1091584K)], 0.1871330 secs] [Times: user=0.88 sys=0.00, real=0.18 secs]
2017-10-28T20:30:32.592+0000: 382101.064: [Full GC (Allocation Failure) [PSYoungGen: 0K->0K(253952K)] [ParOldGen: 146547K->146547K(348160K)] 146547K->146547K(602112K), [Metaspace: 45647K->45647K(1091584K)], 0.1852237 secs] [Times: user=0.88 sys=0.01, real=0.18 secs]
2017-10-28T20:30:32.786+0000: 382101.257: [Full GC (Allocation Failure) [PSYoungGen: 192K->0K(253952K)] [ParOldGen: 146547K->146558K(348160K)] 146739K->146558K(602112K), [Metaspace: 45647K->45647K(1091584K)], 0.1879624 secs] [Times: user=0.89 sys=0.00, real=0.19 secs]
2017-10-28T20:30:32.979+0000: 382101.450: [Full GC (Allocation Failure) [PSYoungGen: 0K->0K(253952K)] [ParOldGen: 146558K->146552K(348160K)] 146558K->146552K(602112K), [Metaspace: 45647K->45647K(1091584K)], 0.2331467 secs] [Times: user=1.15 sys=0.00, real=0.24 secs]

Exception in stdout:

java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(Unknown Source) ~[?:1.8.0_121]
        at java.nio.ByteBuffer.allocate(Unknown Source) ~[?:1.8.0_121]

A similar stack trace is also observed for plain object allocations.
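
For context, `ByteBuffer.allocate` constructs a `HeapByteBuffer` backed by an ordinary `byte[]` on the Java heap, so this failure is a plain heap allocation, not direct/native memory. A minimal sketch of the failing call shape (the ~1MB size is taken from the comments below; the class name is hypothetical):

```java
import java.nio.ByteBuffer;

public class HeapAllocSketch {
    public static void main(String[] args) {
        // ByteBuffer.allocate(n) builds a HeapByteBuffer over a new byte[n];
        // if the heap cannot satisfy that array allocation, the JVM throws
        // java.lang.OutOfMemoryError: Java heap space, as in the trace above.
        ByteBuffer buf = ByteBuffer.allocate(1024 * 1024); // ~1MB
        System.out.println("capacity = " + buf.capacity());
    }
}
```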

System Details:

  • Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)

  • OS: Red Hat Enterprise Linux Server release 7.3 (Maipo)

  • This application was using AnonHugePages (Transparent Huge Pages, THP); a quick check of the THP mode is sketched below
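
Since THP comes up again in the comments, here is a minimal sketch for confirming the THP mode, assuming the standard RHEL 7 sysfs path (the class name is hypothetical):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ThpMode {
    public static void main(String[] args) throws IOException {
        // The active mode is shown in brackets, e.g. "[always] madvise never".
        byte[] mode = Files.readAllBytes(
                Paths.get("/sys/kernel/mm/transparent_hugepage/enabled"));
        System.out.print(new String(mode));
    }
}
```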
  • Possibly you are allocating / trying to allocate *really large objects*; i.e., objects that are larger than the maximum threshold for allocating them in the Eden space. – Stephen C Oct 31 '17 at 14:55
  • @StephenC This process was working fine for 50 hours (with good load) and then all of a sudden it started failing. The max size of a single object being allocated by this JVM is around 1MB. – rohitmohta Oct 31 '17 at 15:18
  • Instrument the call site of `java.nio.ByteBuffer.allocate` to see how many bytes it's trying to allocate (a sketch is given after these comments). – the8472 Oct 31 '17 at 16:12
  • @the8472 We see that for normal object allocation too, i.e. not ByteBuffer; plain *new Class()*. – rohitmohta Oct 31 '17 at 16:45
  • Could it be that your OS is actually out of memory? – Didier L Oct 31 '17 at 18:03
  • @DidierL Per our system monitoring, we had 10GB+ free memory at that point. – rohitmohta Oct 31 '17 at 20:12
  • It could be that a cosmic ray has changed some bit in your system's memory. This *smells* like a JVM bug, but unless you can reproduce it or provide more relevant info (like a full crash dump), it would be impossible to diagnose. – Stephen C Oct 31 '17 at 22:35
  • @StephenC lol. The current theory we're working with is to blame Transparent Huge Pages (THP). – rohitmohta Oct 31 '17 at 23:34
  • Do you have any real evidence to support that theory? – Stephen C Nov 01 '17 at 00:37
  • @StephenC Nothing reproducible as of now. Will share if we can reproduce something with certainty. – rohitmohta Nov 01 '17 at 14:10
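
A minimal sketch of the instrumentation the8472 suggests, assuming the application's allocation sites can be routed through a wrapper (`allocateLogged` is a hypothetical helper, not a JDK API):

```java
import java.nio.ByteBuffer;

public final class BufferAllocLogger {
    private BufferAllocLogger() {}

    // Hypothetical wrapper: call this instead of ByteBuffer.allocate so the
    // requested size is recorded before the allocation is attempted.
    public static ByteBuffer allocateLogged(int capacity) {
        System.err.println("ByteBuffer.allocate requested " + capacity + " bytes");
        try {
            return ByteBuffer.allocate(capacity);
        } catch (OutOfMemoryError e) {
            System.err.println("allocation of " + capacity + " bytes failed: " + e);
            throw e;
        }
    }
}
```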

0 Answers