
My application receives about 10 requests per millisecond (10,000 req/s). From time to time it has to reload a file (about 30 MB) into memory as a HashMap, and after each reload the GC pauses for about 1 s to collect the old HashMap. During that pause many requests time out, because they have been waiting for more than 1 s. I have tried increasing the number of thread pools and tuning -XX:MaxGCPauseMillis. Are there any suggestions on what else to do?

Infrastructure: Quarkus Reactive (Java 11) with G1GC. JAVA_OPTIONS: -Xmx2g -Xms1.5g. K8s pods: 2 GB/4 GB memory and 1/2 CPU cores in requests/limits.

Implementation: every hour the application reloads a CSV file and recreates a singleton HashMap. The clients read from this HashMap at a rate of 10 requests per millisecond.
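
For readers, the reload pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual application code: the class name `ReloadableLookup`, the two-column `key,value` CSV layout, and the `String` key and value types are all assumptions. The new map is built off to the side and then published through a single volatile reference, so request threads never block on the reload. Note that this does not by itself reduce the GC work needed to collect the old map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for the hourly reloaded lookup table.
class ReloadableLookup {

    // volatile: readers always see either the old or the fully built new map
    private volatile Map<String, String> current = Collections.emptyMap();

    // Hot path used by request threads: one volatile read, no locking.
    String get(String key) {
        return current.get(key);
    }

    int size() {
        return current.size();
    }

    // Builds a fresh map from CSV lines (assumed "key,value" format)
    // and swaps it in with a single reference write.
    void reload(Iterable<String> csvLines) {
        Map<String, String> next = new HashMap<>();
        for (String line : csvLines) {
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // skip lines that do not match the assumed format
            }
            next.put(line.substring(0, comma), line.substring(comma + 1));
        }
        // After this write the old map becomes garbage for the GC to collect.
        current = Collections.unmodifiableMap(next);
    }
}
```

The hourly job would call something like `lookup.reload(Files.readAllLines(path))` from a background thread, while request handlers only call `lookup.get(key)`.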

Update 1: I tried ZGC instead of G1 (-XX:+UnlockExperimentalVMOptions -XX:+UseZGC -XX:InitiatingHeapOccupancyPercent=70). The application hit its CPU and memory limits (2 cores/4 GB) and after some time it crashed with an OOM.

  • Add more RAM to the server. The garbage collector will only run when there is memory contention. – Elliott Frisch Mar 22 '21 at 22:24
  • Which GC are you using? Have you tried the Zero Pause collector (ZGC)? What are the options you start the JVM with? – markspace Mar 22 '21 at 22:28
  • @ElliottFrisch I configured 2GB in requests and 4GB in limits (k8s) but neither the memory nor cpu is bottlenecking. It's using 2.5GB and 900mc. – user2109429 Mar 22 '21 at 22:39
  • @markspace very interesting I will try it. -Xmx2g -Xms1.5g -XX:MaxGCPauseMillis=100. – user2109429 Mar 22 '21 at 22:44
  • If you're using G1GC, Oracle recommends different parameters than what you're using now: https://docs.oracle.com/cd/E40972_01/doc.70/e40973/cnf_jvmgc.htm – markspace Mar 22 '21 at 22:52
  • @markspace Yes, I'm using G1; I forgot to mention that. Thanks again. I will test with this recommendation and with ZGC and report the results. – user2109429 Mar 22 '21 at 23:06
  • That page is for configuring the GC for the "Oracle Communications WebRTC Session Controller" ... whatever that is. It won't be valid for all applications. A more apropos article would be: https://www.oracle.com/technical-resources/articles/java/g1gc.html – Stephen C Mar 22 '21 at 23:20
  • One second is far too short for a request timeout. – user207421 Mar 22 '21 at 23:55
  • @markspace I tried ZGC instead of G1 (-XX:+UnlockExperimentalVMOptions -XX:+UseZGC -XX:InitiatingHeapOccupancyPercent=70). My application hit its CPU and memory limits and after some time it crashed with an OOM. – user2109429 Mar 23 '21 at 23:13
  • @BasilBourque I will update the description thanks – user2109429 Mar 23 '21 at 23:13
  • It’s hard to believe that the GC stops for an entire second for a heap that has a maximum of 2GB. This would indicate an excessive use of `LinkedList` or similarly GC unfriendly objects. But your question doesn’t tell what kind of objects you are producing. We only get to know that the source is a 30MB CSV file, which is irrelevant to the garbage collector. – Holger Mar 24 '21 at 10:33
  • @markspace I have my doubts whether recommendations including the `XX:PermSize` option, which has no relevance since Java 8, are suitable for a Java 11 environment. On the other hand, recommending some fixed numbers of GC threads without knowing the number of CPU cores on the actual machine, is questionable in general. Even when it comes from Oracle. – Holger Mar 24 '21 at 10:51
  • Thanks for all the answers. I fixed the problem by updating Quarkus and increasing the number of pods to reduce the load. – user2109429 Mar 26 '21 at 17:19
