
I am running a smoke test against a YARN cluster, using yarn-cluster as the master, with the SparkPi example program. Here is the command line:

  $SPARK_HOME/bin/spark-submit --master yarn-cluster \
    --executor-memory 8G --executor-cores 240 \
    --class org.apache.spark.examples.SparkPi \
    examples/target/scala-2.11/spark-examples-1.4.1-hadoop2.7.1.jar

YARN accepts the job but then fails with a "bad substitution" error. Maybe it is related to hdp.version?

15/09/01 21:54:05 INFO yarn.Client: Application report for application_1441066518301_0013 (state: ACCEPTED)
15/09/01 21:54:05 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1441144443866
     final status: UNDEFINED
     tracking URL: http://yarnmaster-8245.lvs01.dev.ebayc3.com:8088/proxy/application_1441066518301_0013/
     user: stack
15/09/01 21:54:06 INFO yarn.Client: Application report for application_1441066518301_0013 (state: ACCEPTED)
15/09/01 21:54:10 INFO yarn.Client: Application report for application_1441066518301_0013 (state: FAILED)
15/09/01 21:54:10 INFO yarn.Client:
     client token: N/A
     diagnostics: Application application_1441066518301_0013 failed 2 times due to AM Container for appattempt_1441066518301_0013_000002 exited with  exitCode: 1
For more detailed output, check application tracking page:http://yarnmaster-8245.lvs01.dev.ebayc3.com:8088/cluster/app/application_1441066518301_0013Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e03_1441066518301_0013_02_000001
Exit code: 1
Exception message: /mnt/yarn/nm/local/usercache/stack/appcache/
application_1441066518301_0013/container_e03_1441066518301_0013_02_000001/
launch_container.sh: line 24: $PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:
/usr/hdp/current/hadoop-client/*::$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:
/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-.6.0.${hdp.version}.jar:
/etc/hadoop/conf/secure: bad substitution

Stack trace: ExitCodeException exitCode=1: /mnt/yarn/nm/local/usercache/stack/appcache/application_1441066518301_0013/container_e03_1441066518301_0013_02_000001/launch_container.sh: line 24: $PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
    at org.apache.hadoop.util.Shell.run(Shell.java:456)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Of note here is:

/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-.6.0.${hdp.version}.jar:
/etc/hadoop/conf/secure: bad substitution

The "sh" is linked to bash:

$ ll /bin/sh
lrwxrwxrwx 1 root root 4 Sep  1 05:48 /bin/sh -> bash
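
Incidentally, `${hdp.version}` is not valid parameter syntax in bash either: a dot is not allowed in a shell variable name, which is exactly why the shell reports "bad substitution". The Maven-style placeholder was evidently supposed to be filled in before the script ran. A quick demo (the exact message format varies by bash version):

    $ bash -c ': ${hdp.version}'
    bash: ${hdp.version}: bad substitution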
WestCoastProjects
  • I'm not sure about this, but I guess your `/bin/sh` is dash, not bash. This could probably be the problem. See `man sh` to double check. – xuhdev Sep 01 '15 at 23:03
  • @xuhdev No, I only use bash, to maintain compatibility. – WestCoastProjects Nov 06 '15 at 17:00
  • I had the same problem: I added the right hdp.version in spark-defaults.conf, but the error remains. Following the path /usr/hdp/${hdp.version}/hadoop/lib/ on the cluster, I found that no hadoop-lzo-.6.0.${hdp.version}.jar exists...?! – dieHellste Aug 11 '16 at 11:37

6 Answers


This is caused by hdp.version not being substituted correctly. You have to set hdp.version in a java-opts file under $SPARK_HOME/conf.

You also have to set

spark.driver.extraJavaOptions -Dhdp.version=XXX 
spark.yarn.am.extraJavaOptions -Dhdp.version=XXX

in spark-defaults.conf under $SPARK_HOME/conf, where XXX is your HDP version.
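
If you prefer not to edit files, the same two properties can also be passed straight to spark-submit with --conf. A sketch against the question's command, where 2.2.0.0-2041 is a placeholder for your own HDP version:

    $SPARK_HOME/bin/spark-submit --master yarn-cluster \
      --conf spark.driver.extraJavaOptions=-Dhdp.version=2.2.0.0-2041 \
      --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=2.2.0.0-2041 \
      --class org.apache.spark.examples.SparkPi \
      examples/target/scala-2.11/spark-examples-1.4.1-hadoop2.7.1.jar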

zhang zhan

If you are using Spark with HDP, then you have to do the following:

Add these entries to $SPARK_HOME/conf/spark-defaults.conf, substituting your installed HDP version:

spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041

spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041

Create a file called java-opts in $SPARK_HOME/conf and add your installed HDP version to that file, like this:

-Dhdp.version=2.2.0.0-2041

To figure out which HDP version is installed, run this command on the cluster:

hdp-select status hadoop-client
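
On an HDP node this prints the active version for the client package; the output is shaped roughly like this (your version string will differ):

    hadoop-client - 2.2.0.0-2041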
Sudarsan

I had the same issue:

launch_container.sh: line 24: $PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*::$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution

As I couldn't find any /usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo* file, I just edited mapred-site.xml and removed the entry "/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:" from the classpath.
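
For reference, on HDP that classpath string typically comes from the mapreduce.application.classpath property in mapred-site.xml (property name assumed from stock HDP configs); after trimming the lzo entry, the value would end roughly like this:

    <property>
      <name>mapreduce.application.classpath</name>
      <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:...:/etc/hadoop/conf/secure</value>
    </property>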

Ra_
  1. In Ambari, go to YARN.

Click Configs -> Advanced -> Custom yarn-site -> Add Property...

Add hdp.version as the key and your HDP version as the value. You can find the HDP version with the command below:

hdp-select versions

e.g. 2.5.3.0-37

Now add your property as

hdp.version=2.5.3.0-37

  2. Otherwise, replace ${hdp.version} with your HDP version (e.g. 2.5.3.0-37) in yarn-site.xml and yarn-env.sh.
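
In XML form, the custom yarn-site entry from step 1 would look roughly like this (using the example version above):

    <property>
      <name>hdp.version</name>
      <value>2.5.3.0-37</value>
    </property>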
Balkrushna Patil

I also had this issue, using BigInsights 4.2.0.0 with YARN, Spark, and MapReduce 2, and it was caused by iop.version. To fix it, add the iop.version variable to mapred-site, which can be done with the following steps:

In Ambari Server go to:

  • MAPREDUCE2
  • Configs (tab)
  • Advanced (tab)
  • Click on Custom mapred-site
  • Add Property...
  • Add iop.version as the key and your BigInsights version as the value.
  • Restart all services.

This has fixed it.
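
The resulting custom mapred-site entry would be roughly as below (4.2.0.0 taken from the version mentioned above; use your own):

    <property>
      <name>iop.version</name>
      <value>4.2.0.0</value>
    </property>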

dirceusemighini

This may be caused by /bin/sh being linked to dash instead of bash, which often happens on Debian-based systems.

To fix it, run sudo dpkg-reconfigure dash and select No.
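
You can verify which shell /bin/sh resolves to before and after; a quick sanity check (resolved paths may vary by distro):

    $ readlink -f /bin/sh
    /bin/dash
    $ sudo dpkg-reconfigure dash    # answer "No" so sh points at bash
    $ readlink -f /bin/sh
    /bin/bash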

xuhdev