I am new to Linux and Hadoop, and I am having the same issue as in this question. I think I understand what is causing it, but I don't know how to solve it (I don't know what they mean by "Edit the Hadoop server's configuration file so that it includes its NIC's address."). The other post that they link to says that the configuration files should refer to the machine's externally accessible host name. I think I got this right, as every Hadoop configuration file refers to "master" and the /etc/hosts file lists the master by its private IP address. How can I solve this?

Edit: I have 5 nodes: master, slavec, slaved, slavee and slavef, all running Debian. This is the hosts file on master:

127.0.0.1       master
10.0.1.201      slavec
10.0.1.202      slaved
10.0.1.203      slavee
10.0.1.204      slavef

This is the hosts file on slavec (it looks similar on the other slaves):

10.0.1.200      master
127.0.0.1       slavec
10.0.1.202      slaved
10.0.1.203      slavee
10.0.1.204      slavef

The masters file on master:

master

The slaves file on master:

master
slavec
slaved
slavee
slavef

On each slave (slavex), the masters and slaves files contain only one line: slavex

miguel
  • Are you able to describe your cluster setup (nodes, network config etc) in more detail? Possibly post your masters and slaves files? – Chris White Oct 02 '12 at 02:07
  • Have you already seen [this answer](http://stackoverflow.com/questions/4855808/hadoop-job-tracker-only-accessible-from-localhost) ? – Alexander Janssen Oct 02 '12 at 06:41
  • @Alex Yes, I have. That's what I meant by "The other post that they link ..." but I am not sure I got it right. – miguel Oct 02 '12 at 10:37

1 Answer

First, this is a great tutorial for getting started with Hadoop: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

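Second, looking at your environment, it seems that the /etc/hosts, "masters" and "slaves" files are misconfigured. In particular, each node's hosts file maps its own host name to 127.0.0.1 instead of its 10.0.1.x address. You could set up a single "hosts" file and share it with all nodes. For your scenario, it should look like this: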

127.0.0.1      localhost
10.0.1.200      master
10.0.1.201      slavec
10.0.1.202      slaved
10.0.1.203      slavee
10.0.1.204      slavef
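
If you want to check a hosts file for this mistake before copying it around, here is a minimal Python sketch. It only flags cluster host names that are mapped to a loopback address, which is what would make a daemon reachable from the local machine only. The function name and the inlined file contents are just for illustration (the first is the master's hosts file from the question); in practice you would read /etc/hosts on each node instead.

```python
import ipaddress


def loopback_hostnames(hosts_text, cluster_names):
    """Return cluster host names that a hosts file maps to a loopback address."""
    flagged = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        address, *names = line.split()
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # skip lines that do not start with a valid IP address
        if is_loopback:
            flagged.extend(name for name in names if name in cluster_names)
    return flagged


CLUSTER = {"master", "slavec", "slaved", "slavee", "slavef"}

# Hosts file on master as posted in the question.
MASTER_HOSTS = """\
127.0.0.1       master
10.0.1.201      slavec
10.0.1.202      slaved
10.0.1.203      slavee
10.0.1.204      slavef
"""

# Shared hosts file suggested in this answer.
FIXED_HOSTS = """\
127.0.0.1       localhost
10.0.1.200      master
10.0.1.201      slavec
10.0.1.202      slaved
10.0.1.203      slavee
10.0.1.204      slavef
"""

print(loopback_hostnames(MASTER_HOSTS, CLUSTER))  # ['master']
print(loopback_hostnames(FIXED_HOSTS, CLUSTER))   # []
```

An empty list for every node's hosts file means no cluster host name resolves to 127.0.0.1 through that file.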

Third, the "masters" and "slaves" files only need to be configured on the "master" node. The first file lists only the server that will run the JobTracker and NameNode; the second lists all servers that will run a TaskTracker and DataNode.
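
To make the expected contents concrete, here is a small Python sketch that builds the text of both files for your five nodes. It assumes you want to keep running a DataNode and TaskTracker on master as well, as in the slaves file you posted; the function and its parameter names are made up for this example, and nothing is written to disk.

```python
def cluster_files(master, workers, master_runs_worker_daemons=True):
    """Build the contents of the masters and slaves files as strings."""
    slaves = list(workers)
    if master_runs_worker_daemons:
        # Assumption: master also runs a DataNode and TaskTracker.
        slaves.insert(0, master)
    return {
        "masters": master + "\n",
        "slaves": "\n".join(slaves) + "\n",
    }


files = cluster_files("master", ["slavec", "slaved", "slavee", "slavef"])
for name, content in files.items():
    print("--- " + name + " ---")
    print(content, end="")
```

Pass master_runs_worker_daemons=False if master should not store data or run tasks.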