Why does a Spark application run much slower with a lower MaxGCPauseMillis?

Question

I am testing Spark 1.5.1 with different G1 configurations and observe that my application takes 2 min to complete with MaxGCPauseMillis = 200 (the default) and 4 min with MaxGCPauseMillis = 1. The heap usage is depicted below. The statistics below show that the GC time of the two configurations differs by only about 5 sec.

Why does the execution time increase this much?

Some statistics:

MaxGCPauseMillis = 200 - No. young GCs: 67; GC time of an executor: 9.8 sec

MaxGCPauseMillis = 1 - No. young GCs: 224; GC time of an executor: 14.7 sec

The red area is the young generation, the black area is the old generation. The application runs on 10 nodes, each with 1 executor and a 6 GB heap.

The application is a Word Count example:

val lines = sc.textFile(args(0), 1)

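// SPACE is not defined in this snippet; it is assumed to be a precompiled
// pattern such as java.util.regex.Pattern.compile(" ")
val words = lines.flatMap(l => SPACE.split(l))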
val ones = words.map(w => (w,1))
val counts = ones.reduceByKey(_ + _)

//val output = counts.collect()
//output.foreach(t => println(t._1 + ": " + t._2))
counts.saveAsTextFile(args(1))

Which tool did you use to display it like this? – Ross Brigoli, Nov 11 '19 at 06:48

score 3 · Answer 1 · answered Feb 09 '16 at 08:05

MaxGCPauseMillis is a hint to the JVM that a pause caused by GC should not last longer than the specified value (in milliseconds). It is a soft goal, not a guarantee. The G1 default of 200 milliseconds is the recommended value for most production-grade systems.
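For context, a pause goal like this is normally handed to the executors as a JVM option. The sketch below assumes Spark's standard `spark.executor.extraJavaOptions` and `spark.executor.memory` settings. The application name and the GC logging flags are illustrative additions, since the question does not show the exact options that were used.

```scala
import org.apache.spark.SparkConf

object G1ConfSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: illustrative option set, not the asker's actual command line.
    val gcOptions = Seq(
      "-XX:+UseG1GC",
      "-XX:MaxGCPauseMillis=200", // soft pause-time goal in ms (200 is the G1 default)
      "-XX:+PrintGCDetails",      // Java 7/8 GC logging, useful for counting collections
      "-XX:+PrintGCTimeStamps"
    ).mkString(" ")

    val conf = new SparkConf()
      .setAppName("WordCountG1")           // hypothetical application name
      .set("spark.executor.memory", "6g")  // heap size is set here, not via extraJavaOptions
      .set("spark.executor.extraJavaOptions", gcOptions)

    // Print the options that would be passed to each executor JVM
    println(conf.get("spark.executor.extraJavaOptions"))
  }
}
```

Changing only the `MaxGCPauseMillis` value between runs and comparing the GC logs is one way to reproduce the comparison in the question.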

Anything lower may force the GC to run more often than necessary, which reduces the overall throughput of the application. That is exactly what is happening in your case.

There are 67 young GCs with MaxGCPauseMillis=200, and more than three times as many (224) with MaxGCPauseMillis=1.
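To put the two runs side by side, the figures from the question can be turned into an average pause per young GC. This is only a rough sketch: it assumes the reported executor GC time consists mostly of young-collection pauses, which the question does not confirm.

```scala
object PauseStats {
  // Figures reported in the question:
  // (pause goal in ms, number of young GCs, GC time of an executor in s)
  val runs = Seq((200, 67, 9.8), (1, 224, 14.7))

  // Average pause per young GC, in milliseconds
  def avgPauseMs(count: Int, gcSeconds: Double): Double =
    gcSeconds * 1000.0 / count

  def main(args: Array[String]): Unit = {
    for ((goal, count, gcSec) <- runs)
      println(f"goal=$goal%d ms  youngGCs=$count%d  avgPause=${avgPauseMs(count, gcSec)}%.1f ms")
    // How many times more often the collector ran with the 1 ms goal
    println(f"GC count ratio: ${224.0 / 67}%.2f")
  }
}
```

Under that assumption the average pause drops from roughly 146 ms to roughly 66 ms. The collector reacts to the lower goal with shorter, more frequent collections, yet it still ends up far above the 1 ms target, which is consistent with the value being a goal rather than a limit.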

Refer here for more detailed explanations.

score 0 · Answer 2 · answered Feb 12 '16 at 01:31

Your intuition is wrong. In theory, for a fixed heap size, throughput and latency (which MaxGCPauseMillis hints at in this case) trade off against each other. So when you lower MaxGCPauseMillis to reduce latency, your throughput goes down too.
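One way to read the trade-off from the numbers in the question is to compare GC time with total run time. The sketch below assumes the reported executor GC time can be set directly against the job's wall-clock time, which is a simplification.

```scala
object GcShare {
  // Fraction of wall-clock time spent in GC
  def gcShare(gcSeconds: Double, wallSeconds: Double): Double =
    gcSeconds / wallSeconds

  def main(args: Array[String]): Unit = {
    // (pause goal in ms, run time in s, GC time of an executor in s), from the question
    val runs = Seq((200, 120.0, 9.8), (1, 240.0, 14.7))
    for ((goal, wall, gc) <- runs)
      println(f"goal=$goal%d ms  gcShare=${gcShare(gc, wall) * 100}%.1f%%  nonGc=${wall - gc}%.1f s")
  }
}
```

Measured this way, reported GC time explains only about 5 s of the 120 s difference between the runs. The rest of the lost throughput has to come from costs that do not show up as GC time, and the question does not contain enough data to say which ones.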

Tao Mao
