9

I'm trying to run the following Spark example under Hadoop 2.6, but I get the following error:

INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032, and the client enters a loop trying to connect. I'm running a cluster of two machines, one master and one slave.

./bin/spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--num-executors 3 \
--driver-memory 2g \
--executor-memory 2g \
--executor-cores 1 \
--queue thequeue \
lib/spark-examples*.jar \
10

This is the error I get:

15/12/06 13:38:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable  
15/12/06 13:38:29 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032  
15/12/06 13:38:30 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)  
15/12/06 13:38:31 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)   
15/12/06 13:38:32 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)   
15/12/06 13:38:33 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)   
15/12/06 13:38:34 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

Output of jps on the master:

hduser@master:/usr/local/spark$ jps

4930 ResourceManager 
4781 SecondaryNameNode 
5776 Jps 
4608 DataNode 
5058 NodeManager 
4245 Worker 
4045 Master 

My /etc/hosts:

192.168.0.1 master 
192.168.0.2 slave 

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
Jose Antonio

7 Answers

3

This error usually occurs when the hostname is not configured correctly. Check that the hostname is configured correctly and matches the one you specified for the ResourceManager.
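As a quick sanity check, you can compare the machine's hostname with what the configuration expects (a minimal sketch; the name master comes from the /etc/hosts in the question, and the path to yarn-site.xml is an assumption that may differ on your setup):

hostname                 # the name this machine reports
hostname -f              # fully qualified name, if configured
getent hosts master      # verify the name resolves to the address listed in /etc/hosts
grep -A 1 yarn.resourcemanager $HADOOP_CONF_DIR/yarn-site.xml   # which address clients will try to reach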

Naruto
3

I faced the same problem and solved it as follows.

Do the following steps:

  1. Start YARN with the command: start-yarn.sh
  2. Check that the ResourceManager is running with the command: jps
  3. Add the following property to the configuration (yarn-site.xml), then restart YARN; the verification commands after the snippet show how to confirm it took effect.

<property>
   <name>yarn.resourcemanager.address</name>
   <value>127.0.0.1:8032</value>
</property>
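A quick way to confirm the fix after restarting (a sketch, assuming the default ResourceManager port 8032 and that the Hadoop scripts are on your PATH):

stop-yarn.sh
start-yarn.sh
jps                          # ResourceManager should appear in the list
netstat -tlnp | grep 8032    # the ResourceManager should be listening on the configured address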
sunanda
1

I had also encountered this issue, where I was not able to submit the Spark job with spark-submit.

The issue was caused by the HADOOP_CONF_DIR path being missing when launching the Spark job. So, whenever you submit a job, set HADOOP_CONF_DIR to the appropriate Hadoop configuration directory, e.g. export HADOOP_CONF_DIR=/etc/hadoop/conf.
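For example, a minimal sketch mirroring the command from the question (the configuration path is an assumption; adjust it to your installation):

export HADOOP_CONF_DIR=/etc/hadoop/conf   # must contain core-site.xml and yarn-site.xml
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
lib/spark-examples*.jar \
10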

AKs
0

You need to make sure that yarn-site.xml is on the classpath, and also make sure that the relevant properties in it are set to true.
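One way to get yarn-site.xml onto the classpath is to point Spark at the directory that contains it (a sketch; /usr/local/hadoop/etc/hadoop is an assumed path, use your own Hadoop configuration directory):

export YARN_CONF_DIR=/usr/local/hadoop/etc/hadoop   # directory containing yarn-site.xml
hadoop classpath                                    # the printed classpath should include that directory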

0

Similarly, export HADOOP_CONF_DIR=/etc/hadoop/conf also worked in my case for Flink on YARN, when running ./bin/yarn-session.sh -n 2 -tm 2000.

Matiji66
0

As you can see here, yarn.resourcemanager.address is derived from yarn.resourcemanager.hostname, whose default value is 0.0.0.0, so you should configure it correctly.
From the base of the Hadoop installation, edit the etc/hadoop/yarn-site.xml file and add this property:

  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
  </property>

Executing start-yarn.sh again will put your new settings into effect.
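In a multi-node setup like the one in the question, the value would typically need to be the master's hostname rather than localhost (an assumption based on the /etc/hosts shown above). A minimal sketch of applying and checking the change:

stop-yarn.sh
start-yarn.sh
yarn node -list        # should now talk to the configured ResourceManager address instead of 0.0.0.0:8032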

mahyard
0

I had the same problem. In my case the cause was that the clocks were not synchronized between machines, since my ResourceManager is not on the master machine. Even a one-second difference can cause YARN connection problems, and a few more seconds of difference can prevent your NameNode and DataNode from starting. Use ntpd to configure time synchronization and make sure the times are exactly the same.
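A minimal sketch of setting that up on Debian/Ubuntu-style hosts (an assumption; package names and commands differ on other distributions):

sudo apt-get install ntp     # install and start the NTP daemon on every machine
ntpq -p                      # check that the daemon is synchronizing with its peers
date                         # compare across machines; the output should agree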

Jing He