
The schedule() method in Master.scala shows that its first task is scheduling drivers on Workers. Since the Master only runs in standalone mode, I thought drivers would run on the client, outside the Spark cluster.

Why does the Master need to schedule a Worker to run the Driver?

Jacek Laskowski
CCong

1 Answer


If you are referring to private def schedule(): Unit, that method schedules the drivers of Spark applications that were submitted with spark-submit using --deploy-mode cluster.

From Launching Applications with spark-submit (that is linked from Cluster Mode Overview):

--deploy-mode Whether to deploy your driver on the worker nodes (cluster) or locally as an external client (client) (default: client)
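For example, a submission that asks the standalone Master to launch the driver on a worker could look like the following (the master URL, main class, and jar path are placeholders, not values from the question):

```shell
# Submit so the driver itself runs on a worker inside the cluster;
# the Master's schedule() then picks a worker with enough resources for it.
# spark://master-host:7077, com.example.MyApp, and the jar path are placeholders.
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --class com.example.MyApp \
  /path/to/my-app.jar
```

With --deploy-mode client (the default), the driver would instead start in the spark-submit process itself, and schedule() would only have executors to place.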

In cluster deploy mode, the driver runs on a worker in the cluster, regardless of the cluster manager (Spark Standalone, Hadoop YARN, or Apache Mesos), as do the Spark executors.
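To make the driver-scheduling step concrete, here is a simplified, self-contained sketch of what that part of schedule() does: walk the waiting drivers and place each on the first alive worker with enough free cores and memory. The WorkerInfo/DriverInfo case classes and scheduleDrivers function below are hypothetical illustrations, not the actual Spark source.

```scala
// Hypothetical simplified types standing in for Spark's internal
// WorkerInfo and DriverInfo; only the resource fields matter here.
case class WorkerInfo(id: String, freeCores: Int, freeMemoryMb: Int)
case class DriverInfo(id: String, requiredCores: Int, requiredMemoryMb: Int)

// Assign each waiting driver to the first worker that can host it,
// deducting the driver's resources from that worker as we go.
// Returns a map of driver id -> chosen worker id; drivers that fit
// nowhere stay unassigned (they would remain in waitingDrivers).
def scheduleDrivers(
    aliveWorkers: Seq[WorkerInfo],
    waitingDrivers: Seq[DriverInfo]): Map[String, String] = {
  var workers = aliveWorkers
  val assignments = scala.collection.mutable.Map.empty[String, String]
  for (driver <- waitingDrivers) {
    workers.find(w =>
        w.freeCores >= driver.requiredCores &&
        w.freeMemoryMb >= driver.requiredMemoryMb) match {
      case Some(w) =>
        assignments(driver.id) = w.id
        // Charge the chosen worker for the driver's cores and memory.
        workers = workers.map { worker =>
          if (worker.id == w.id)
            worker.copy(
              freeCores = worker.freeCores - driver.requiredCores,
              freeMemoryMb = worker.freeMemoryMb - driver.requiredMemoryMb)
          else worker
        }
      case None => // no capacity now; the driver keeps waiting
    }
  }
  assignments.toMap
}
```

The real schedule() also shuffles the alive workers before placing drivers (to spread load) and then calls into executor scheduling, but the resource check per driver is the essence of why the Master must pick a Worker for the Driver in cluster deploy mode.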

Jacek Laskowski