
I'm using the G1 garbage collector for an application that should use a single CPU core (it's a Spark job running on YARN). Probably because the JVM sees all the available cores, it uses quite a large number of parallel GC threads, from 18 to 23. Does it really matter? Should I set the number of parallel threads manually?

Eugene
synapse

1 Answer


First, here is a rather interesting observation (at least on JDK 15).

Suppose this code:

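import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

public class Sandbox {

    public static void main(String[] args) {

        // Keep the JVM alive so that jcmd can attach to it.
        while (true) {
            LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(10_000));
        }

    }

}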

I run it with java -XX:ActiveProcessorCount=1 Sandbox.java, connect to it via jcmd <PID> VM.flags, and notice a rather interesting thing: -XX:+UseSerialGC. Since only a single CPU has been specified, the JVM sees no point in using G1GC (which makes sense). So be prepared.
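The same check can be made from inside the process instead of through jcmd. This is only a minimal sketch: the class name GcInfo and the helper collectorNames are made up for illustration, and the collector names mentioned in the comment are what HotSpot typically reports, not something taken from the run above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original experiment.
public class GcInfo {

    // Names of the collectors registered by the running JVM. On HotSpot this is
    // typically "Copy" and "MarkSweepCompact" for SerialGC, or names starting
    // with "G1" for G1GC.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Reflects -XX:ActiveProcessorCount when that flag is set.
        System.out.println("CPUs seen by the JVM: " + Runtime.getRuntime().availableProcessors());
        System.out.println("Collectors: " + collectorNames());
    }
}
```

Running it once with -XX:ActiveProcessorCount=1 and once without should show whether the collector choice changes with the visible CPU count.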


You can get what you want via -XX:ActiveProcessorCount=<N>; this is how many CPUs your JVM will see (and it will most probably build its internal heuristics based on that).

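G1 uses two types of threads, sized by ParallelGCThreads and ConcGCThreads, and they fulfill different purposes: the former do the work during stop-the-world pauses, the latter run the concurrent phases (such as marking) alongside the application.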

If I repeat the exercise above (start a process and connect to it, but this time with -XX:+UseG1GC also enabled), I see that ParallelGCThreads is not changed (defaults to 10), and neither is ConcGCThreads (which defaults to 3), which to me is a surprise. Or not. It depends on how you look at it: the VM tried to prohibit the usage of G1GC to begin with.

Of course, 10 parallel threads competing for 1 CPU isn't a great way to set up your VM.
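To see what the JVM actually ended up with, the flag values can be read at runtime through the HotSpot-specific HotSpotDiagnosticMXBean. This is a rough sketch, not a recommendation: the class name GcThreadFlags and the helper flag are made up for illustration, and pinning both thread counts to 1 (for example with -XX:+UseG1GC -XX:ParallelGCThreads=1 -XX:ConcGCThreads=1) is an assumption about what suits a single-core job, to be verified by measurement.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class; works on HotSpot-based JVMs only.
public class GcThreadFlags {

    // Returns the current value of a HotSpot flag as a string.
    // Throws IllegalArgumentException if the JVM does not know the flag.
    static String flag(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        String[] names = {"UseG1GC", "UseSerialGC", "ParallelGCThreads", "ConcGCThreads"};
        for (String name : names) {
            System.out.println(name + " = " + flag(name));
        }
    }
}
```

Comparing the output with and without the explicit thread flags shows whether the values you pass are the ones in effect.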

Eugene
  • Actually, G1 is used by accident because it was mentioned in the Spark optimization guide. The real problem was the size of the young generation: for whatever reason the default GC wasn't increasing its size even though the program produces 300MB/s of garbage. – synapse Oct 23 '20 at 07:54
  • But SerialGC does not have string deduplication, has it? Further, unless it has been substantially rewritten, its rigid memory organization may lead to worse performance compared to modern algorithms. – Holger Oct 23 '20 at 09:11
  • @Holger I've tried SerialGC and the performance was terrible, I've asked a new question now that I've figured out what's going on https://stackoverflow.com/questions/64525636/best-settings-for-an-application-that-produces-lots-of-garbage – synapse Oct 25 '20 at 16:34