
I have a DataProc Spark cluster. Initially, the master and 2 worker nodes were of type n1-standard-4 (4 vCPUs, 15 GB memory); then I resized all of them to n1-highmem-8 (8 vCPUs, 52 GB memory) via the web console.

I noticed that the two worker nodes are not being fully used. In particular, there are only 2 executors on the first worker node and 1 executor on the second, with

spark.executor.cores 2
spark.executor.memory 4655m

in /usr/lib/spark/conf/spark-defaults.conf. I thought that with spark.dynamicAllocation.enabled set to true, the number of executors would be increased automatically.

Also, the information on the DataProc page of the web console doesn't get updated automatically either. It seems that DataProc still thinks that all nodes are n1-standard-4.

My questions are:

  1. Why are there more executors on the first worker node than on the second?
  2. Why aren't more executors added to each node?
  3. Ideally, I want the whole cluster to be fully utilized. If the Spark configuration needs to be updated, how do I do that?
zyxue

1 Answer


As you've found, a cluster's configuration is set when the cluster is first created and does not adjust to manual resizing.

To answer your questions:

  1. The Spark ApplicationMaster takes up a container in YARN on a worker node, usually on the first worker if only a single Spark application is running.
  2. When a cluster is started, Dataproc attempts to fit two YARN containers per machine.
  3. The YARN NodeManager configuration on each machine determines how much of the machine's resources should be dedicated to YARN. This can be changed on each VM under /etc/hadoop/conf/yarn-site.xml, followed by a sudo service hadoop-yarn-nodemanager restart. Once machines are advertising more resources to the ResourceManager, Spark can start more containers. After adding more resources to YARN, you may also want to change the size of the containers Spark requests by modifying spark.executor.memory and spark.executor.cores (see the sketch just after this list).
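
As a rough sketch of point 3: the property names below are the standard YARN ones, but the values are only illustrative for an n1-highmem-8 (leave whatever headroom you need for the OS and the Hadoop/Dataproc daemons). On each worker, edit /etc/hadoop/conf/yarn-site.xml:

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>49152</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>8</value>
</property>

then restart the NodeManager on that worker:

$ sudo service hadoop-yarn-nodemanager restart

and, if you want larger executors to match, adjust /usr/lib/spark/conf/spark-defaults.conf (again, illustrative values only):

spark.executor.cores 4
spark.executor.memory 10g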

Instead of resizing cluster nodes and manually editing configuration files afterwards, consider starting a new cluster with the new machine sizes and copying any data from your old cluster to the new one. In general, the simplest way to move data is to use Hadoop's built-in distcp utility. An example usage would be something along the lines of:

$ hadoop distcp hdfs:///some_directory hdfs://other-cluster-m:8020/

Or if you can use Cloud Storage:

$ hadoop distcp hdfs:///some_directory gs://<your_bucket>/some_directory
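
If you do take the route of starting a fresh cluster with the larger machines, the gcloud command might look roughly like the following (a sketch; the cluster name and worker count are placeholders, and you may need other flags for your setup):

$ gcloud dataproc clusters create my-new-cluster \
    --master-machine-type n1-highmem-8 \
    --worker-machine-type n1-highmem-8 \
    --num-workers 2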

Alternatively, consider always storing data in Cloud Storage and treating each cluster as an ephemeral resource that can be torn down and recreated at any time. In general, any time you would save data to HDFS, you can also save it as:

gs://<your_bucket>/path/to/file
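
For example, a Spark job only needs to point its output path at GCS instead of HDFS. A minimal sketch in the Scala spark-shell (the paths are placeholders, and this assumes the GCS connector that ships with Dataproc images):

// read from HDFS, write the same data back out to Cloud Storage
sc.textFile("hdfs:///some_directory")
  .saveAsTextFile("gs://<your_bucket>/some_directory")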

Saving to GCS has the nice benefit of allowing you to delete your cluster (along with the data in HDFS, which lives on its persistent disks) when it is not in use.

Angus Davis
  • Thanks. I figured out the first two points myself yesterday. As for the third point, I killed the nodemanager on one of the worker nodes (`w-0`) with `kill -9` and tried starting it with `sudo yarn nodemanager`, but it turns out the system got messed up. Even though the number of executors increased, they all failed to finish tasks for some reason. What's the difference between `sudo yarn nodemanager` and `sudo service hadoop-yarn-nodemanager restart`? – zyxue Aug 04 '16 at 19:47
  • Also, what's the recommended way to transfer data on HDFS between clusters? The data is rather large. – zyxue Aug 04 '16 at 19:49
  • I feel that fitting two YARN containers per machine and having the ApplicationMaster occupy a whole container is quite inefficient, especially when the machine is large. Is there a way to configure this when creating the cluster? – zyxue Aug 04 '16 at 19:50
  • Thanks for following up! I've updated the answer with a couple of distcp examples. As for the difference between `sudo service ...` and `yarn nodemanager`, the service file in /etc/init.d/hadoop-yarn-nodemanager does a little bit of environment setup before starting the daemon. You can indeed change the size of app masters: when starting a cluster, pass `--properties=spark:spark.yarn.am.memory=XXXm`, or when submitting a job, `--properties spark.yarn.am.memory=XXXm`. – Angus Davis Aug 04 '16 at 20:24