I have a Dataproc Spark cluster. Initially, the master and the 2 worker nodes were of type n1-standard-4 (4 vCPUs, 15 GB memory); then I resized all of them to n1-highmem-8 (8 vCPUs, 52 GB memory) via the web console.
I noticed that the two worker nodes are not being fully used. In particular, there are only 2 executors on the first worker node and 1 executor on the second, with

```
spark.executor.cores 2
spark.executor.memory 4655m
```

in /usr/lib/spark/conf/spark-defaults.conf. I thought that with `spark.dynamicAllocation.enabled true`, the number of executors would be increased automatically.
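In case it's useful, here is how I looked at what YARN reports for each node (just a sketch, run from the master node; I'm assuming the stock YARN setup that Dataproc ships with):

```
# List all NodeManagers and the resources YARN sees on each worker
yarn node -list -all

# Show the memory/vCores capacity of one worker in detail;
# <node-id> is the NodeId (host:port) printed by the list command
yarn node -status <node-id>
```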
Also, the information on the Dataproc page of the web console doesn't get updated automatically either. It seems that Dataproc still thinks all nodes are n1-standard-4.
My questions are:

- Why are there more executors on the first worker node than on the second?
- Why aren't more executors added to each node?
- Ideally, I want the whole cluster to be fully utilized. If the Spark configuration needs to be updated, how should I do it? (Is something like the sketch below the right approach?)
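For example, I was wondering whether overriding the executor settings per job is the intended route. Here is the kind of command I have in mind (the cluster name, region, main class, and jar path are placeholders, and the property values are just guesses for an 8-vCPU, 52 GB machine, not something I've verified):

```
# Override executor sizing at submit time instead of editing
# spark-defaults.conf; values below are unverified guesses
gcloud dataproc jobs submit spark \
    --cluster my-cluster \
    --region us-central1 \
    --class org.example.MyJob \
    --jars gs://my-bucket/my-job.jar \
    --properties spark.executor.cores=4,spark.executor.memory=12g
```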