
I got Hortonworks Sandbox 2.0 and it is running fine. Now I want to try to connect from Eclipse, but I am unable to.

  • Here is the Hadoop location in Eclipse:
    Map/Reduce Master: host: localhost, port: 50020
    DFS Master: host: localhost, port: 50040

The error is:

Call to localhost/127.0.0.1:50040 failed on connection exception: java.net.ConnectException: Connection refused: no further information.

I think the error could be a mismatched Hadoop plugin, since I am using the Hadoop Eclipse plugin 1.2.0, but I could not find an Eclipse plugin for Hadoop 2.2.

The answer in the thread How to use java to connect hadoop 2.2.0 server says that there is currently no Eclipse plugin for Hadoop 2.2.0. Can someone guide me through this?

ngunha02
  • just a guess: try switching the firewall and other security tools off temporarily – xhudik Nov 24 '13 at 12:07
  • I just did, but no luck. I also switched to Hortonworks 1.3, but the error still occurred. I think I need to find another way to learn MapReduce with Java then – ngunha02 Nov 24 '13 at 12:20
  • Hmm, it looks more like a connectivity problem. Does 127.0.0.1:50070 work in your browser? If not, it is definitely a problem with VirtualBox connectivity – xhudik Nov 25 '13 at 18:44
  • Well, VirtualBox says 127.0.0.1:8888, but when I looked around, everyone on here says something about using ports 50030, 50040, ... as defaults?! I am not sure if you've used Hortonworks, but to clarify: when it says Eclipse, does that mean Eclipse on the host computer or Eclipse in the VirtualBox? (The Hortonworks sandbox does not provide an X GUI, but the Cloudera sandbox has an X GUI with Eclipse preinstalled!) – ngunha02 Nov 25 '13 at 19:05

3 Answers


Hadoop 2.2.0 no longer uses the JobTracker as such: YARN has split the JobTracker's work into two parts (a global ResourceManager and per-application ApplicationMasters); see the Apache Hadoop documentation.

First of all, go to mapred-site.xml and add the properties below alongside the others mentioned in the installation steps:

<property>
  <name>mapreduce.jobtracker.address</name>
  <value>localhost:54311</value>
</property>
<property>
  <name>mapreduce.jobtracker.http.address</name>
  <value>0.0.0.0:50030</value>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>0.0.0.0:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>0.0.0.0:19888</value>
</property>

and after that configure your Hadoop location in Eclipse as:

Map/Reduce (V2) Master
  Host: localhost
  Port: 54311

DFS Master
  Check the "Use M/R Master host" checkbox
  Port: 9000

Now everything should be fine.
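Before pointing Eclipse at these ports, it can help to confirm from the host that something is actually listening on them. A minimal, plugin-independent sketch using only the JDK (the host and port values mirror this answer's settings and are assumptions about your setup):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isOpen(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // A failure here matches the "Connection refused" Eclipse reports
            return false;
        }
    }

    public static void main(String[] args) {
        // Ports from this answer; adjust them to your own configuration.
        System.out.println("M/R master 54311: " + isOpen("localhost", 54311, 2000));
        System.out.println("DFS master 9000:  " + isOpen("localhost", 9000, 2000));
    }
}
```

If either check prints false, the Eclipse plugin will fail with the same connection-refused error, so fix connectivity first.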

Anumoy Sutradhar

Jacky, there are different ports for different services.

50070 is the HDFS web service, which every Hadoop system has (that is why I suggested it).

8888 is likely just a Hortonworks port for some specific web service.

50030 is the JobTracker port.

First of all, make sure that you can connect to your VirtualBox (that the host OS can access guest services). If so, find out what service/port you need, e.g. Jobtracker API error - Call to localhost/127.0.0.1:50030 failed on local exception: java.io.EOFException

It can be a lot of work, so if you know the Cloudera distribution has everything you need, go for Cloudera.
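To check quickly which of these web services answer from the host, an HTTP status probe is enough; a rough sketch with no Hadoop libraries required (the ports are the defaults named above and may differ on your sandbox):

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class WebUiProbe {
    /** Returns the HTTP status code, or -1 if the service is unreachable. */
    static int httpStatus(String url, int timeoutMs) {
        try {
            HttpURLConnection c = (HttpURLConnection) new URL(url).openConnection();
            c.setConnectTimeout(timeoutMs);
            c.setReadTimeout(timeoutMs);
            return c.getResponseCode();
        } catch (Exception e) {
            return -1; // connection refused, timeout, bad URL, etc.
        }
    }

    public static void main(String[] args) {
        // 50070 = HDFS NameNode web UI, 50030 = JobTracker web UI (per this answer)
        System.out.println("HDFS UI:       " + httpStatus("http://127.0.0.1:50070/", 2000));
        System.out.println("JobTracker UI: " + httpStatus("http://127.0.0.1:50030/", 2000));
    }
}
```

A 200 means the service is reachable from the host; -1 on 50070 points at the VirtualBox connectivity problem described above.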

xhudik
  • Thank you for explaining the differences between ports. And yes, I am going for Cloudera now. I am going to accept your answer. – ngunha02 Nov 26 '13 at 20:06
  • Just one more comment: it seems Cloudera as well as Hortonworks provide their demo products with one node only. This is good if you are a complete novice; however, if you really want to understand Hadoop, it is not enough. I have experience with Aster DB (a big-data DB, in some ways similar to Hadoop). The Aster demo (called Aster Express) provides 1 queen (namenode) and 2 workers (datanodes) - a much better setup if you want to see/feel parallel computing in reality – xhudik Nov 27 '13 at 11:55
  • Thanks!! I am a complete novice. I'll work with Hortonworks and Cloudera for a while to get myself familiar, then I'll be sure to check out Aster DB! – ngunha02 Nov 27 '13 at 19:33

I found myself in a similar situation when I was unable to connect to the Hive server in the HortonWorks sandbox. What I found is that the virtual image used for the sandbox uses NAT for networking, which means the guest OS (the sandbox in this case) shares the IP address of the machine it runs on. To enable communication, the virtualization software provides port forwarding: the ports configured for Hadoop in the sandbox are mapped to different (or sometimes the same) ports on the host by default. You can check the configured port-forwarding rules to reach a specific service/port from the host operating system.

As for the Eclipse plugin for Hadoop 2.2.0, I have yet to figure out how to set it up, so I will post more as I go along my Hadoop development journey.
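Under NAT you always connect to 127.0.0.1 on the host and VirtualBox relays to the guest, so the way a connection fails hints at what is wrong. A small sketch that classifies the outcome (the interpretation comments are rules of thumb, not guarantees, and 8888 is just the sandbox's default welcome-page port):

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class NatDiagnose {
    /** Classifies a TCP connection attempt to a (possibly forwarded) host port. */
    static String probe(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return "open";    // a forwarding rule exists and the guest service answered
        } catch (SocketTimeoutException e) {
            return "timeout"; // often a firewall silently dropping packets
        } catch (ConnectException e) {
            return "refused"; // no forwarding rule, or the guest service is not listening
        } catch (IOException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("127.0.0.1:8888  -> " + probe("127.0.0.1", 8888, 2000));
        System.out.println("127.0.0.1:50070 -> " + probe("127.0.0.1", 50070, 2000));
    }
}
```

If 8888 is "open" but 50070 is "refused", the sandbox itself is up and you most likely need to add a forwarding rule for 50070 in the VirtualBox network settings.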

  • About the Eclipse plugin, would you mind sharing a link to your blog that shows step by step how to set it up? I got Cloudera installed, which has Eclipse preinstalled (Linux), but it would be a lot better if I could use Eclipse directly from my host machine (Windows 7). Thanks!! – ngunha02 Jan 22 '14 at 17:49
  • Jacky, unfortunately I do not have a solution yet for setting up the Eclipse plugin. To continue my experimentation with Hadoop, I am using Maven packaging to compile, and I manually submit the job using the hadoop command line in my own 6-node virtual Hadoop cluster. – Pulsating Taurus Feb 09 '14 at 08:57