1

I have a 20 mode c4.4xlarge cluster to run a spark job. Each node is a 16 vCore, 30 GiB memory, EBS only storage EBS Storage:32 GiB machine.

Since each node has 16 vCore, I understand that maximum number of executors are 16*20 > 320 executors. Total memory available is 20(#nodes)*30 ~ 600GB. Assigning 1/3rd to system operations, I have 400 GB of Memory to process my data in-memory. Is this the right understanding.

Also, Spark History shows non-uniform distribution of input and shuffle. I believe the processing is not distributed evenly across executors. I pass these config parameters in my spark-submit -

> —-conf spark.dynamicAllocation.enabled=true  —-conf spark.dynamicAllocation.minExecutors=20

Executor summary from spark history UI also shows that data distribution load is completely skewed, and I am not using the cluster in the best way. How can I distribute my load in a better way -

enter image description here

Abhi
  • 1,153
  • 1
  • 23
  • 38
  • This was done running a single job in the cluster. However, leveraging Fair-scheduler has increased overall parallelism and cluster utilization, thus giving me better throughput – Abhi Jan 09 '19 at 20:44

0 Answers0