I'm trying to build a simple Hadoop MapReduce program in Java. I looked at the example code available and tried to set up a project myself. I created the following Gradle script, but when I looked at the installed dependencies, none of them contained Mapper or Reducer, or even the org.apache.hadoop.mapreduce package.

group 'org.ardilgulez.demoprojects'
version '1.0-SNAPSHOT'

apply plugin: 'java'

repositories {
    mavenCentral()
}

dependencies {
    testCompile group: 'junit', name: 'junit', version: '4.11'
    compile group: 'org.apache.hadoop', name: 'hadoop-common', version: '2.7.3'
    compile group: 'org.apache.hadoop', name: 'hadoop-hdfs', version: '2.7.3'
    compile group: 'org.apache.hadoop', name: 'hadoop-yarn-common', version: '2.7.3'
    compile group: 'org.apache.hadoop', name: 'hadoop-minicluster', version: '2.7.3'
    compile group: 'org.apache.hadoop', name: 'hadoop-mapreduce-client-core', version: '2.7.3'
    compile group: 'org.apache.hadoop', name: 'hadoop-mapreduce-client-jobclient', version: '2.7.3'
    compile group: 'org.apache.hadoop', name: 'hadoop-mapreduce-client-app', version: '2.7.3'
    compile group: 'org.apache.hadoop', name: 'hadoop-mapreduce-client-shuffle', version: '2.7.3'
    compile group: 'org.apache.hadoop', name: 'hadoop-mapreduce-client-common', version: '2.7.3'
    compile group: 'org.apache.hadoop', name: 'hadoop-client', version: '2.7.3'
}

I know I won't need at least 7 of those 10 Hadoop dependencies, but I don't know which of them contains the org.apache.hadoop.mapreduce package (as far as I can tell, none of the dependencies listed above does).

Which dependency/repository should I add so that I can actually build a MapReduce job?

Can I do that with raw org.apache.hadoop packages and not vendor packages (such as Cloudera)?

Thanks in advance for all the help.

Tunaki
ardilgulez
  • It's late 2018 and I cannot find any documentation whatsoever on Hadoop's distributions. Their website has broken links all over the place, and not even their most basic MapReduce "tutorial" touches on how to set up a project or which dependencies to rely on. It is extremely hard for me to understand how this project can be used by multinational, multi-billion-dollar corporations and yet lack the most basic documentation. Apache Hadoop has become yet another reason for me to discourage my future children from becoming software developers. – Martin Andersson Nov 22 '18 at 16:47
  • I should add that the closest I came was a list of [build artifacts](https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/DownstreamDev.html#Build_Artifacts). – Martin Andersson Nov 22 '18 at 16:53

3 Answers

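This should be the correct dependency; it is the artifact that contains the org.apache.hadoop.mapreduce package, including Mapper and Reducer: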

compile 'org.apache.hadoop:hadoop-mapreduce-client-core:2.7.3'

Be sure to refresh your Gradle project after changing the build script.
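If it is unclear whether the jar actually ended up on the classpath, a small check along these lines might help. This is only a sketch: the class name ClasspathCheck and the method isOnClasspath are made up for illustration, and it assumes you run it with the same classpath Gradle resolves for the project. It uses nothing beyond the JDK, so it compiles even when Hadoop is missing and simply reports that.

```java
public class ClasspathCheck {

    // Returns true if the named class can be located on the current classpath.
    // (Hypothetical helper, not part of Hadoop or Gradle.)
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class itself or something it links against is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // The two classes the question says could not be found.
        String[] required = {
            "org.apache.hadoop.mapreduce.Mapper",
            "org.apache.hadoop.mapreduce.Reducer"
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

If both lines report MISSING even though the dependency is declared, the problem is more likely a stale project import or a failed download than a wrong artifact name.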

bblincoe

Add the following dependency:

compile group: 'org.apache.hadoop', name: 'hadoop-core', version: '2.7.3'
Prashant_M

In newer versions of Gradle (I'm using 7.5.1), the compile configuration is no longer available; it was removed in Gradle 7.0. Use implementation instead (and testImplementation in place of testCompile).

dependencies {
    implementation 'org.apache.hadoop:hadoop-common:2.7.7'
    implementation 'org.apache.hadoop:hadoop-mapreduce-client-core:2.7.7'
}
DccBr