1

I have a situation here:

JDK version is showing below:

openjdk version "1.8.0_292"
OpenJDK Runtime Environment (build 1.8.0_292-b10)
OpenJDK 64-Bit Server VM (build 25.292-b10, mixed mode)

My application is running with default GC strategy with JVM.

Starting by command:

nohup java -jar -Xms512M -Xmx512M -Xmn300M -XX:SurvivorRatio=6 -XX:MetaspaceSize=256M -XX:MaxMetaspaceSize=256M -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Xloggc:/data/code/logs/application-gc.log -XX:+HeapDumpBeforeFullGC -XX:HeapDumpPath=/data/code/logs/test-dump -Dspring.profiles.active=test /data/code/application.jar >> /data/code/logs/application.log &

Old generation in JVM will always getting unexpected little objects promoted from young generation and keep old generation usage going up, and then cause a full GC each 4 hours during my test.

Here is the testing code:

    @PostMapping("/test/jvmPressure")
    public CommonResult jvmTest() {
        List<Role> roles = new ArrayList<>();
        for (long i = 0L; i < 10000L; i++) {
            Role role = new Role();
            role.setLevel(1);
            role.setParentId(1L);
            role.setStatus(true);
            role.setName("this is a test object");
            role.setRoleName("this is a test object");
            role.setCompanyId(123123);
            role.setId(i);
            roles.add(role);
        }
        return CommonResult.succeed();
    }

I just create a lot of objects and do nothing, to make eden area full and trigger YGC by continuing requesting this API.

Observing by Jmeter, all requests are finished in 15 ms, and I don't think any of the Role object will suvive to old generation.

But usage of old generation will going up anyway.

Is it possible to figure out what's new to old generation? I cannot find any unusual objects in dump file, and GC log is also quite normal. But still, I'm confused that which object is keep promoting into old generation...

According to my latest observation of Jmeter record and dump file.I have found another strange problem.

The max response time is 2250ms and objects created is 40Mb/s, which means, YGC will trigger every 7s because of the 300M eden space. I thought that many test objects will go into suvive area at the time of YGC triggered, but no test objects would survive more than 2 YGCs. Which means no test objects will go into old generation at all.

But the fact is, when I create dump files after 500 YGCs and find there are lots of unexpected test objects in heap. Before I dump heap, I have shut down Jmeter and waited for 2 YGCs to ensure there are no test objects in young generation, to see if there are test objects promoted to old generation.Unfortunately, it can be seen from this picture that some of the test objects were promoted into old generation in no reason(comparing with a dump file created by same method before).

enter image description here

Here is the GC log shortcut: enter image description here enter image description here enter image description here

william
  • 265
  • 4
  • 17
  • 1
    Do these full GCs still happen when you’re not monitoring your application? – Holger Oct 11 '22 at 06:07
  • @Holger Yes, full GC still happen if no Jmeter requests incomming and it might be caused by spring-boot's daemon threads or something else, but in this situation, the Full GC will be triggered onece per 48+ hours based on observation of dump file. – william Oct 11 '22 at 06:18
  • @Holger However, according to Jmeter's test report, the max response time is 2250ms, and YGC time interval observed by `jstat -gc 500 100000` is longer than 10s, which means testing objects will never survive more than 2 YGCs, but still, testing objects will get into old generation with no reason. I don't know why. – william Oct 11 '22 at 06:28
  • I assume that you're running the parallel GC? You might try increasing the size of the survivor space even more: https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/sizing.html – boneill Oct 11 '22 at 19:13
  • Test objects will be promoted to OldGeneration if they live long enough (survivors) or if code is allocating large blocks of memory (direct to old). Given @PostMapping and jmeter I am assuming that you are stress testing this code with multiple parallel request, and those are being handled by multiple threads. If you have large pool of threads handling request, purely statistically some threads are gonna stall just enough that it is possible for objects to survive enough to be promoted. I would start tuning by increasing eden to old gen ratio. Be aware this tuning is purely for bench score. – Talijanac Oct 13 '22 at 07:15
  • [Here](https://stackoverflow.com/a/36596077/3448419) is the tool to find what goes to Old generation. You may also try tuning `-XX:SurvivorRatio`, `-XX:NewSize` and disable `-XX:-UseAdaptiveSizePolicy` – apangin Oct 14 '22 at 20:53

1 Answers1

1

The problem was solved. And here is the cause: It is due to parameter -XX:SurvivorRatio=6,I thought it will give survivor area a rather large space to contain those test objects.

But the fact is: when the requests comming in, JVM still continue reducing the S area space every YGC ignoring -XX:SurvivorRatio=6, S area space was what I set at the first time and then JVM will dynamically change it into JVM's way.

My first setting of S area was over 20MB, and it is enough to contain all the survived test objects, but YVM will turn down S area space each YGC and finnaly stable at about 2-3MB.

So, by printing gclog with command -XX:+PrintAdaptiveSizePolicy -XX:+PrintTenuringDistribution and I can see some overflow log in the gc log, and some of the test objects were directly put into old generation due to the S area overflow.

Then I try to figure out why JVM will change my S area space,and find that only by setting -XX:-UseAdaptiveSizePolicy and -XX:SurvivorRatio=6 to shutdown JVM's dynamic adjestment of S area and then give it a large space to contain test objects, will solve this problem.

william
  • 265
  • 4
  • 17