
What issue did I face?

  • I specified spark.driver.supervise=true in spark-defaults.conf
  • The ${SPARK_HOME}/bin/spark-submit script did not pick up this configuration; it started the job/driver with spark.driver.supervise=false
  • PS: So far, spark.driver.supervise is the only property I have found that is not honoured when set via spark-defaults.conf. The other properties I am using work perfectly fine.

How to reproduce the issue?

  • Add spark.driver.supervise=true or spark.driver.supervise true in spark-defaults.conf
  • Submit a Spark application using the ${SPARK_HOME}/bin/spark-submit script (a minimal sketch of these steps follows this list)
  • The job/driver will NOT be running in supervision mode. To verify, go to <DRIVER_URL>:4040/environment/ and check the spark.driver.supervise property
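
For concreteness, here is a minimal sketch of the reproduction. The master host, example jar, and paths below are placeholders/assumptions, not copied from my setup:

```
# Placeholders below -- adjust to your own environment.

# 1. Add the property to spark-defaults.conf (either syntax):
echo "spark.driver.supervise true" >> "${SPARK_HOME}/conf/spark-defaults.conf"

# 2. Submit an application in standalone cluster mode WITHOUT --supervise:
"${SPARK_HOME}/bin/spark-submit" \
  --master spark://<MASTER_HOST>:7077 \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  "${SPARK_HOME}/examples/jars/spark-examples_2.11-2.4.4.jar" 100

# 3. Open <DRIVER_URL>:4040/environment/ in a browser: spark.driver.supervise
#    shows up as false, i.e. the value from spark-defaults.conf was ignored.
```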

What was I expecting?

  • I expected that, even without passing the --supervise argument explicitly to the spark-submit command, the setting would be picked up from spark-defaults.conf
  • This understanding of specifying default Spark properties in spark-defaults.conf comes from the Spark configuration docs.

What makes this issue important?

  • Apache Livy is a REST service that lets you submit Spark jobs via HTTP requests.
  • Livy is used by many organizations; AWS EMR and Cloudera use it under the hood.
  • Livy internally invokes the ${SPARK_HOME}/bin/spark-submit shell script.
  • As per Livy's code, there is no way to specify the supervise option (see the example request after this list)
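
For reference, this is roughly what a Livy batch submission looks like: the /batches endpoint takes a `conf` map but exposes no dedicated supervise flag. Host, port, jar path, and class name below are placeholders for this sketch:

```
# Placeholder host/port/jar/class -- a sketch of a Livy batch submission.
curl -s -X POST "http://<LIVY_HOST>:8998/batches" \
  -H "Content-Type: application/json" \
  -d '{
        "file": "hdfs:///path/to/your-app.jar",
        "className": "com.example.YourApp",
        "conf": { "spark.driver.supervise": "true" }
      }'
# Even with spark.driver.supervise in "conf", the driver is not supervised
# (see the comment thread below).
```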

An ugly hack to make Livy/spark-submit work in supervision mode by default

  • Open ${SPARK_HOME}/bin/spark-submit in a text editor
  • Change the line `exec "${SPARK_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit "$@"` TO `exec "${SPARK_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit "--supervise" "$@"`
  • Notice the extra shell argument in double quotes, "--supervise". This does the trick, and Livy/spark-submit now submits in supervision mode (see the sketch after this list).
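
Put differently, the hack is just an edit to the last line of the script, shown here before and after (copied from the steps above; the surrounding script may differ slightly by Spark version):

```
# ${SPARK_HOME}/bin/spark-submit -- last line

# original:
#   exec "${SPARK_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit "$@"

# modified (forces --supervise for EVERY submission, hence "ugly hack"):
exec "${SPARK_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit "--supervise" "$@"
```
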
asked by surajs21
  • You still can try to pass it with `..., "conf": { "spark.driver.supervise": true }, ...` in the POST request body. Can you check if it works that way? Also note that by Spark docs this config `Only has effect in Spark standalone mode or Mesos cluster deploy mode`, for which Livy support is limited. – Aliaksandr Sasnouskikh Jan 21 '20 at 10:58
  • I have tried specifying it in the Livy POST body's conf part. It did not work. I then checked Livy's [source code](https://github.com/apache/incubator-livy/blob/master/server/src/main/scala/org/apache/livy/utils/SparkProcessBuilder.scala) and found that they have not handled the case of `spark.driver.supervise`. Ideally, Livy should have code that checks whether `spark.driver.supervise` is set to true in the POST params' conf part and, if yes, appends `--supervise` when invoking the spark-submit command/script. And yes, I am using Spark standalone cluster mode. – surajs21 Jan 21 '20 at 11:32
  • Now let's think only from the `spark-submit`'s perspective. As per spark config [doc](https://spark.apache.org/docs/latest/configuration.html#application-properties), they support `spark.driver.supervise` config to be added in spark-defaults.conf. So, if what config doc is saying is true, then it should have worked when invoking `spark-submit` command/script from CLI manually. Even without me specifying `--supervise` args, spark-submit should have picked it up from spark-defaults.conf. So, either there is a bug in spark code OR my understanding is wrong. Someone, please educate me more on this – surajs21 Jan 21 '20 at 11:43
  • I've checked the [Livy code](https://github.com/apache/incubator-livy/blob/master/server/src/main/scala/org/apache/livy/utils/SparkProcessBuilder.scala#L183-L184) - it sets all the `conf` values with the builder. So I feel that somewhere `SparkSubmit` flow overrides the `--conf spark.driver.supervise`. Will try to debug that a bit later. Can you share the Spark/Livy versions you use? And have you made any customizations? – Aliaksandr Sasnouskikh Jan 21 '20 at 11:47
  • Spark version = 2.4.4, Hadoop version = 3.1.3, Livy version = 0.6.0. The only customization that I did is that I am NOT using the "spark-and-hadoop" bundled package. I have downloaded `spark-2.4.4-bin-without-hadoop.tgz` and `hadoop-3.1.3.tar.gz` separately and using it by specifying `export SPARK_DIST_CLASSPATH=$({{hadoop.install_dir}}/bin/hadoop classpath)` – surajs21 Jan 21 '20 at 12:03
  • Did you figure this out? I am trying to tell spark to use VLAN-correct interfaces and nothing I set in `spark-env.sh` or `spark-defaults.conf` seems to be taking. – dannyman Jul 28 '21 at 17:17

0 Answers