
While I have seen several pieces of documentation suggesting that the driver runs on its own node (the master) and the executors run on slave nodes (also called workers), I have somehow become confused about this. Hence I would like to confirm the following, if possible:

Where does the driver run on a cluster of type:

  1. Standalone
  2. Yarn
  3. Mesos

I think I have the answer for 2, which is that the driver runs on the master. However, I am unsure about 1 and 3. Can someone help clarify?

Finally, if a driver shares a node with an executor, does that mean that when we size the cluster nodes we need to take into account that more threads might actually run on them and that memory usage might be higher? In other words, should we systematically oversize our nodes to account for a potential driver?

MaatDeamon

1 Answer


All cluster managers have the notion of client mode and cluster mode. Client mode means that the driver runs at the location from which the submission request was created. This doesn't mean the driver must be executed on the master node; it will only do so if you submit the application from the master.

For example, if I submit the application from my local IDE to the Spark master, the driver will run on my local machine.
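To make the distinction concrete, here is a minimal sketch of the two `spark-submit` invocations against a standalone master. The master URL, class name, and jar name are placeholders:

```shell
# Client mode: the driver runs on the machine where spark-submit is
# executed (e.g. your laptop or an edge node), not necessarily the master.
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode client \
  --class com.example.MyApp \
  my-app.jar

# Cluster mode: the driver is launched inside the cluster, on a worker
# node chosen by the cluster manager.
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --class com.example.MyApp \
  my-app.jar
```

`client` is also the default when `--deploy-mode` is omitted, which is why submitting from an IDE or a laptop places the driver there.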

Yuval Itzchakov
  • Wow, that is an interesting twist that I had not thought about at all. I guess I never reflected on the utility of having a client mode in the first place. – MaatDeamon Aug 09 '17 at 14:30
  • Indeed, this will force the driver to run where the submit happens and thus fixes its location. But I guess, for completeness, one needs to clarify the purpose of both modes in the first place. I know, for instance, that Spark will not manage the restart of your driver when you are in client mode; that is written pretty much everywhere in the documentation. If someone wants that behavior, the driver needs to be on the cluster, separated from the client. – MaatDeamon Aug 09 '17 at 14:34
  • https://stackoverflow.com/questions/28807490/what-conditions-should-cluster-deploy-mode-be-used-instead-of-client gives some hints – MaatDeamon Aug 09 '17 at 14:40
  • Do you have any deploy-mode recommendation, given your experience developing and deploying Spark applications? – MaatDeamon Aug 09 '17 at 17:36
  • @MaatDeamon Really depends on the nature of your app. Is it a streaming job? A batch job? How many apps would be running on the cluster? It varies... – Yuval Itzchakov Aug 09 '17 at 17:47
  • 1 streaming app on a cluster. If you have some criteria hints; if not, I can open a new question on that anyway. – MaatDeamon Aug 09 '17 at 17:49
  • @MaatDeamon If it's a single streaming app, I don't think it matters much. The traditional way is to run it in cluster mode with the `--supervise` flag to make sure the driver is fault tolerant. Note this has a drawback: the driver will take resources from a worker node, which you need to plan for ahead of time, making sure you have sufficient memory. – Yuval Itzchakov Aug 09 '17 at 17:52
  • That would mean revisiting my resource planning and accounting for a driver. How does it usually affect your calculations? I planned my nodes for true parallelism, not concurrency, but that did not account for the driver landing on a worker node and executing itself there. Besides, there is no way to know where the driver might end up, so all the nodes must be prepared to hold a driver. I know the driver is not a lot, maybe just one core or two, but still, it means one or two cores on that machine being unavailable at times. – MaatDeamon Aug 09 '17 at 17:55
  • Any feedback from experience? – MaatDeamon Aug 09 '17 at 17:55
  • @MaatDeamon What you're saying is right. If you have a dedicated master node, it might make sense to execute the driver on it. Note that you won't have supervised execution and will have to manually restart it. Make sure you check out HA master with ZooKeeper. – Yuval Itzchakov Aug 09 '17 at 18:05
  • Thanks @Yuval Itzchakov – MaatDeamon Aug 09 '17 at 19:47
  • The question was about "cluster" deployMode, you answered the wrong question and misled everyone. – pferrel Mar 23 '19 at 19:57
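Tying the comment thread together: a supervised cluster-mode submission can also reserve the driver's resources explicitly, which makes the sizing concern raised above something you can plan for per worker. This is a sketch against a standalone master; the master URL, class name, jar name, and resource values are placeholders:

```shell
# Cluster mode with --supervise (standalone mode): the master restarts the
# driver if it exits with a non-zero code. The driver consumes the cores
# and memory below on whichever worker it lands on, so every worker must
# be sized to absorb one driver on top of its executors.
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --supervise \
  --driver-cores 1 \
  --driver-memory 2g \
  --class com.example.StreamingApp \
  my-streaming-app.jar
```

Since the driver can land on any worker, the per-node budget in the question becomes: executor cores/memory plus one driver's `--driver-cores` and `--driver-memory` held in reserve.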