I want to run a Spark Streaming application on a YARN cluster on a remote server. The default Java version there is 1.7, but I want to use 1.8 for my application, which is also installed on the server but is not the default. Is there a way to specify, through spark-submit, the location of Java 1.8 so that I do not get the major.minor version error?
- You use Maven? If so, you can specify the Java version in the pom.xml. – M. Suurland Apr 26 '16 at 11:30
- Maybe you can set JAVA_HOME just before you spark-submit, like this: "JAVA_HOME=/path/to/java ./bin/spark-submit ......" – Hlib Apr 26 '16 at 11:40
- Setting JAVA_HOME before the spark-submit command worked for me. Thanks :) – Priyanka Apr 26 '16 at 12:43
- @Hlib, doing so changed the Java version for the driver of the current application, but not for the executors in the cluster, which still have 1.7 as their default. Can you suggest a workaround for that as well? – Priyanka Apr 27 '16 at 05:40
- Did you try to specify JAVA_HOME in $SPARK_HOME$/conf/spark-env.sh? – Hlib Apr 27 '16 at 08:21
- Or it may be better to put it here: $HADOOP_HOME$/etc/hadoop/yarn-env.sh – Hlib Apr 27 '16 at 08:31
- But that would affect other applications running in the same cluster, so I changed my code to run with Java 7. Thanks :) – Priyanka Apr 27 '16 at 09:21
5 Answers
JAVA_HOME alone was not enough in our case: the driver was running on Java 8, but I discovered later that the Spark workers on YARN were launched with Java 7 (the Hadoop nodes have both Java versions installed).

I had to add spark.executorEnv.JAVA_HOME=/usr/java/<version available in workers> to spark-defaults.conf. Note that you can also provide it on the command line with --conf.

See http://spark.apache.org/docs/latest/configuration.html#runtime-environment
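As a concrete sketch of the above (the JDK path below is just an example; use whatever Java 8 path actually exists on your worker nodes), the setting can go either in spark-defaults.conf or on the spark-submit command line:

    # $SPARK_HOME/conf/spark-defaults.conf  (key and value separated by whitespace)
    spark.executorEnv.JAVA_HOME    /usr/java/jdk1.8.0_121

    # or, equivalently, at submit time
    spark-submit --conf spark.executorEnv.JAVA_HOME=/usr/java/jdk1.8.0_121 ...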

- For those who don't have access/permission to check the Java version on the worker nodes, use `spark.range(0, 100).mapPartitions(_.map(_ => java.lang.System.getProperty("java.version"))).show` as a sanity check. It can be hard to determine the runtime Java version via the YARN / Spark UI. – shay__ Jan 02 '18 at 12:52
- Both _spark.executorEnv.JAVA_HOME_ and _spark.yarn.appMasterEnv.JAVA_HOME_ need to be set. – Avinash Ganta Nov 15 '19 at 09:55
Although you can force the driver code to run on a particular Java version (export JAVA_HOME=/path/to/jre/ && spark-submit ...), the workers will execute the code with the default Java version from the yarn user's PATH on the worker machine.

What you can do is set each Spark instance to use a particular JAVA_HOME by editing the spark-env.sh files (documentation).
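As a minimal sketch (the JDK path here is an assumption; substitute the one installed on your nodes), the relevant line in conf/spark-env.sh on each node would be:

    # conf/spark-env.sh -- sourced when Spark processes are launched on that node
    export JAVA_HOME=/usr/java/jdk1.8.0_121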

If you want to set the Java environment for Spark on YARN, you can set it when invoking spark-submit:
--conf spark.yarn.appMasterEnv.JAVA_HOME=/usr/java/jdk1.8.0_121 \
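For context, a sketch of how that flag fits into a full spark-submit invocation (the class name, JAR, and JDK path are placeholders; spark.executorEnv.JAVA_HOME is added as well, per the other answers, so the executors pick up the same JDK):

    spark-submit \
      --master yarn \
      --conf spark.yarn.appMasterEnv.JAVA_HOME=/usr/java/jdk1.8.0_121 \
      --conf spark.executorEnv.JAVA_HOME=/usr/java/jdk1.8.0_121 \
      --class com.example.MyApp \
      my-app.jar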

Add the JAVA_HOME that you want in spark-env.sh (to locate it: sudo find -name spark-env.sh ..., e.g. /etc/spark2/conf.cloudera.spark2_on_yarn/spark-env.sh)

The Java version needs to be set for both the Spark Application Master and the Spark executors that will be launched on YARN. Thus the spark-submit command must include two JAVA_HOME settings: spark.executorEnv.JAVA_HOME and spark.yarn.appMasterEnv.JAVA_HOME.
spark-submit --class com.example.DataFrameExample --conf "spark.executorEnv.JAVA_HOME=/jdk/jdk1.8.0_162" --conf "spark.yarn.appMasterEnv.JAVA_HOME=/jdk/jdk1.8.0_162" --master yarn --deploy-mode client /spark/programs/DataFrameExample/target/scala-2.12/dfexample_2.12-1.0.jar
