
My big ol' master node hardware is doing practically nothing during my Hadoop/Spark runs, because YARN picks an arbitrary slave node for the ApplicationMaster on each job. I like the old Hadoop 1 way better: a lot of log chasing and SSH pain was avoided that way when things went wrong.

Is it possible to make the ApplicationMaster run on the master node?

Judge Mental
  • Technically the YARN API has methods for these manipulations, like this one: https://hadoop.apache.org/docs/r2.6.0/api/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.html#setAMContainerResourceRequest(org.apache.hadoop.yarn.api.records.ResourceRequest) – AdamSkywalker Jan 27 '17 at 23:30
  • But I've never seen any simple Hadoop example showing how to use it. – AdamSkywalker Jan 27 '17 at 23:32

1 Answer


It's possible with Spark and YARN node labels.

  1. Label your nodes with YARN node labels
  2. Set the spark.yarn.am.nodeLabelExpression property so the ApplicationMaster is only scheduled on nodes carrying that label
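The two steps above can be sketched roughly as follows. The label name `master`, the NodeManager address, and the application jar/class are all placeholders, not values from the question; adapt them to your cluster:

```shell
# Node labels must be enabled in yarn-site.xml beforehand, e.g.:
#   yarn.node-labels.enabled = true
#   yarn.node-labels.fs-store.root-dir = hdfs:///yarn/node-labels

# 1. Create a cluster-level label (the name "master" is arbitrary)
yarn rmadmin -addToClusterNodeLabels "master"

# Attach the label to the node where the AM should run
# (hostname:port is a placeholder for your NodeManager address)
yarn rmadmin -replaceLabelsOnNode "master-host.example.com:45454=master"

# 2. Submit so the ApplicationMaster is restricted to labeled nodes;
# executors remain unconstrained unless you also set
# spark.yarn.executor.nodeLabelExpression
spark-submit \
  --master yarn \
  --conf spark.yarn.am.nodeLabelExpression=master \
  --class com.example.MyApp \
  myapp.jar
```

Note that node label expressions require YARN 2.6 or later; on older versions the Spark property is silently ignored.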


Thomas Decaux