I am trying to run a PySpark (or Spark) job via the Livy server with "spark.master=yarn".
What I have done:
1) In spark-defaults.conf:
spark.master yarn
spark.submit.deployMode client
2) In livy.conf:
livy.spark.master = yarn
livy.spark.deployMode = client
3) I send the request via curl with "conf": {"spark.master": "yarn"}
Example:
curl -X POST -H "Content-Type: application/json" localhost:8998/batches --data '{"file": "hdfs:///user/grzegorz/hello-world.py", "name": "MY", "conf": {"spark.master": "yarn"} }'
{"id":3,"state":"running","appId":null,"appInfo":{"driverLogUrl":null,"sparkUiUrl":null},"log":["stdout: ","\nstderr: "]}
And this is what I always see in the Livy server logs:
18/01/02 14:45:07.880 qtp1758624236-28 INFO BatchSession$: Creating batch session 3: [owner: null, request: [proxyUser: None, file: hdfs:///user/grzegorz/hello-world.py, name: MY, conf: spark.master -> yarn]]
18/01/02 14:45:07.883 qtp1758624236-28 INFO SparkProcessBuilder: Running '/usr/local/share/spark/spark-2.0.2/bin/spark-submit' '--name' 'MY' '--conf' 'spark.master=local' 'hdfs:///user/grzegorz/hello-world.py'
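So even though yarn is configured in spark-defaults.conf, in livy.conf, and in the request itself, Livy still launches spark-submit with 'spark.master=local'.

For completeness, the session state and logs can also be checked via Livy's standard batch endpoints:

curl localhost:8998/batches/3
curl localhost:8998/batches/3/log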
I hope somebody has an idea of how to get past this. Thank you in advance.