
I am running a MapReduce program on a Hadoop single-node cluster, and I frequently make modifications to the MapReduce logic. I am using the Eclipse IDE.

Every time I modify the logic, I create a jar file to test the MapReduce job in Hadoop. Creating a jar file after every change to the MapReduce logic is tedious.

Is there any easier way to create the jar file every time a change is made? Please advise.

user3370144

3 Answers


It's not clear whether having to make a jar file at all is the issue, or whether the process of making the jar seems too difficult. First, you do have to make a jar file to submit a job to Hadoop; there is no way around this. Second, you can use a build tool like Maven to make creating the jar file simpler. The Maven Assembly plugin will also package up an uber jar with all your dependencies if needed.
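For illustration, here is a minimal pom.xml fragment (not from the original answer) that wires the Assembly plugin into the package phase; the main class name is a placeholder you would replace with your own job driver:

    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-assembly-plugin</artifactId>
          <configuration>
            <descriptorRefs>
              <!-- produces an uber jar containing all dependencies -->
              <descriptorRef>jar-with-dependencies</descriptorRef>
            </descriptorRefs>
            <archive>
              <manifest>
                <!-- placeholder: your MapReduce driver class -->
                <mainClass>com.example.WordCountDriver</mainClass>
              </manifest>
            </archive>
          </configuration>
          <executions>
            <execution>
              <id>make-assembly</id>
              <phase>package</phase>
              <goals>
                <goal>single</goal>
              </goals>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>

With something like this in place, running mvn package rebuilds the jar (including a *-jar-with-dependencies.jar under target/) after every change, so the only manual step left is submitting it with hadoop jar.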

rs_atl

I use Scalding to write MapReduce jobs (it's as concise as Pig and as flexible/performant as Java), and I use sbt to build. I have an rsync script that syncs my code to a location on the cluster, where I start up the sbt console - a REPL (shell) where you can import libraries.

The result is that I can run MapReduce jobs interactively in a shell, either by calling my code from the project, writing the code directly into the shell, or copy-pasting code into it. In my experience you cannot beat this kind of workflow!
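As a rough sketch (not the answerer's actual code), a Scalding job along these lines could be kept in the project and invoked from the sbt console; the input/output argument names and field names are placeholders:

    import com.twitter.scalding._

    // Minimal word-count sketch using Scalding's fields API.
    // "input" and "output" are command-line args, e.g. --input / --output.
    class WordCountJob(args: Args) extends Job(args) {
      TextLine(args("input"))
        .flatMap('line -> 'word) { line: String => line.toLowerCase.split("\\s+") }
        .groupBy('word) { _.size }
        .write(Tsv(args("output")))
    }

Such a job is typically run through Scalding's Tool runner with --local or --hdfs plus the --input/--output arguments, which is what makes the edit-and-rerun loop quick.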

samthebest

If you are using the Eclipse IDE, you can add the Hadoop MapReduce plugin to Eclipse and create a location by providing the port numbers of HDFS and MapReduce. Then, in Eclipse, you can simply right-click and choose Run As > Run on Hadoop, which avoids creating a jar file.

chmk