
These are our first steps with big data tools like Apache Spark and Hadoop.

We have installed Cloudera CDH 5.3. From Cloudera Manager we chose to install Spark. Spark is up and running well on one of the nodes in the cluster.

From my machine I made a small application that connects to the cluster and reads a text file stored on HDFS.

I am trying to run the application from Eclipse and it displays these messages

15/02/11 14:44:01 INFO client.AppClient$ClientActor: Connecting to master spark://10.62.82.21:7077...
15/02/11 14:44:02 WARN client.AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.62.82.21:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@10.62.82.21:7077
15/02/11 14:44:02 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@10.62.82.21:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: no further information: /10.62.82.21:7077

The application has one class that creates a context using the following line:

JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("Spark Count").setMaster("spark://10.62.82.21:7077"));

where this IP is the IP of the machine Spark is running on.

Then I try to read a file from HDFS using the following line

sc.textFile("hdfs://10.62.82.21/tmp/words.txt")
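For reference, a minimal sketch of how those two lines fit together in a driver class (the class name and the final count call are illustrative; the master URL and HDFS path are the values given above):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkCount {
    public static void main(String[] args) {
        // Build the context exactly as described above
        SparkConf conf = new SparkConf()
                .setAppName("Spark Count")
                .setMaster("spark://10.62.82.21:7077");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Read the words file from HDFS and force an action on it
        JavaRDD<String> lines = sc.textFile("hdfs://10.62.82.21/tmp/words.txt");
        System.out.println("Lines: " + lines.count());

        sc.stop();
    }
}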

When I run the application I got the

Fanooos
  • do you have more than one IP configured for the same machine? – Harman Feb 11 '15 at 13:16
  • actually, I have no idea, but the same exception is thrown if I use the domain name instead of the IP. – Fanooos Feb 11 '15 at 13:18
  • what do you see when you fire ifconfig? – Harman Feb 12 '15 at 06:02
  • I have checked with the sys admin and the machine has only one IP. Actually I suspect the installation of Spark. There is a Spark process running on the machine (pgrep -f spark replies with a process ID), but when we fire spark-shell, it opens the Scala shell after displaying some exceptions. Is there a way to make sure Spark is properly installed? – Fanooos Feb 12 '15 at 07:33
  • what are the exceptions that you get once you open the shell? are you getting connected to the master? – Harman Feb 12 '15 at 10:46
  • Have you ever sorted it out? – Jacek Laskowski Nov 22 '15 at 14:37
  • I just posted an answer where I fixed this problem and found most of the config parameters do not need to be set. Also to @Fanoos you might want to clean up the end of your question? Did you leave off the error? And perhaps click a check mark to the left of the answer you like, to mark it accepted. If none covers your case please answer your own question with whatever worked! – JimLohse Dec 28 '15 at 18:53

3 Answers

6

Check your Spark master logs, you should see something like:

15/02/11 13:37:14 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkMaster@mymaster:7077]
15/02/11 13:37:14 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkMaster@mymaster:7077]
15/02/11 13:37:14 INFO Master: Starting Spark master at spark://mymaster:7077

Then when you're connecting to the master, be sure to use exactly the same hostname as found in the logs above (do not use the IP address):

.setMaster("spark://mymaster:7077")

Spark standalone is a bit picky with this hostname/IP stuff.
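For example, if the master log shows sparkMaster@mymaster:7077 as above, the driver would be configured with that exact hostname (a short sketch; mymaster is only the placeholder from the log excerpt):

SparkConf conf = new SparkConf()
        .setAppName("Spark Count")
        .setMaster("spark://mymaster:7077"); // must match the host string logged by the master
JavaSparkContext sc = new JavaSparkContext(conf);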

G Quintana
  • The machine has a domain name and the same exception is thrown if I used it instead of IP. – Fanooos Feb 11 '15 at 13:19
  • Another Spark standalone pitfall is that it acts as a peer to peer system. Your client/driver application must be joinable by the master. You may have disable your firewall settings and add to your SparkConf: .set("spark.driver.host", "mydriverapp") .set("spark.driver.port", "7076") – G Quintana Feb 11 '15 at 16:57
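To put that last comment in context, the same configuration with the driver host/port settings added would look roughly like this (mydriverapp and 7076 are the comment's placeholders, not tested values):

SparkConf conf = new SparkConf()
        .setAppName("Spark Count")
        .setMaster("spark://mymaster:7077")
        // make the driver reachable from the master and executors
        .set("spark.driver.host", "mydriverapp")
        .set("spark.driver.port", "7076");
JavaSparkContext sc = new JavaSparkContext(conf);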
4

When you create your Spark master using the shell command "sbin/start-master.sh", go to the address http://localhost:8080 and check the "URL" row.

HELLO
  • This is good advice, at the same time in my testing today I found if the URL showed a hostname, it would not connect no matter what. Only when I set one setting (SPARK_MASTER_IP) and used IP addresses would it connect. – JimLohse Dec 28 '15 at 20:53
1

I notice there's no accepted answer; just for info, I thought I'd mention a couple of things.

First, in the spark-env.sh file in the conf directory, the SPARK_MASTER_IP and SPARK_LOCAL_IP settings can be hostnames. You don't want them to be, but they can be.

As noted in another answer, Spark can be a little picky about hostname vs. IP address, because of this resolved bug/feature: See bug here. The problem is, it's not clear whether they "resolved" it simply by telling us to use the IP instead of the hostname.

Well, I am having this same problem right now, and the first thing to do is check the basics.

Can you ping the box where the Spark master is running? Can you ping the worker from the master? More importantly, can you password-less ssh to the worker from the master box? Per the 1.5.2 docs, you need to be able to do that with a private key AND have the worker entered in the conf/slaves file. I copied the relevant paragraph at the end.

You can get a situation where the worker can contact the master but the master can't get back to the worker so it looks like no connection is being made. Check both directions.

Finally, of all the combinations of settings, in a limited experiment just now I found only one that mattered: on the master, in spark-env.sh, set SPARK_MASTER_IP to the IP address, not the hostname. Then connect from the worker with spark://192.168.0.10:7077 and voilà, it connects! Seemingly none of the other config parameters are needed here.
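To illustrate the combination that worked: with SPARK_MASTER_IP=192.168.0.10 in the master's spark-env.sh, the connecting side in that experiment looked roughly like this (a sketch, not a general recipe):

SparkConf conf = new SparkConf()
        .setAppName("Spark Count")
        // IP address here, matching SPARK_MASTER_IP on the master, not a hostname
        .setMaster("spark://192.168.0.10:7077");
JavaSparkContext sc = new JavaSparkContext(conf);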

Here's the paragraph from the docs about ssh and slaves file in conf:

To launch a Spark standalone cluster with the launch scripts, you should create a file called conf/slaves in your Spark directory, which must contain the hostnames of all the machines where you intend to start Spark workers, one per line. If conf/slaves does not exist, the launch scripts defaults to a single machine (localhost), which is useful for testing. Note, the master machine accesses each of the worker machines via ssh. By default, ssh is run in parallel and requires password-less (using a private key) access to be setup. If you do not have a password-less setup, you can set the environment variable SPARK_SSH_FOREGROUND and serially provide a password for each worker.

Once you have done that, using the IP address should work in your code. Let us know! This can be an annoying problem, and learning that most of the config params don't matter was nice.

JimLohse