I am trying to run multiple MapReduce jobs in Hadoop. After searching on Google, I went with method 2 as described at http://cloudcelebrity.wordpress.com/2012/03/30/how-to-chain-multiple-mapreduce-jobs-in-hadoop/ : use JobControl. I got the following error:
/examples2/format/Dictionary.java:100: error: no suitable method found for addJob(org.apache.hadoop.mapreduce.Job)
jbcntrl.addJob(job);
^
method JobControl.addJob(org.apache.hadoop.mapred.jobcontrol.Job) is not applicable
(actual argument org.apache.hadoop.mapreduce.Job cannot be converted to org.apache.hadoop.mapred.jobcontrol.Job by method invocation conversion)
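For context, the relevant part of Dictionary.java boils down to something like this (a condensed sketch; the configuration details are trimmed, and jbcntrl and job are the names that appear in the error above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;

public class Dictionary {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // New-API job (org.apache.hadoop.mapreduce.Job)
        Job job = new Job(conf, "dictionary");
        // ... mapper/reducer classes and input/output paths set here ...

        JobControl jbcntrl = new JobControl("jbcntrl");
        jbcntrl.addJob(job);  // line 100: no suitable method found for addJob(...)
        jbcntrl.run();
    }
}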
As described in Is it better to use the mapred or the mapreduce package to create a Hadoop Job?, there are two different APIs, which seem to be misaligned here. After looking further, I found JobControl and JobConf.setMapperClass() error. The answers there say that using the mapreduce package's

org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl

instead of

org.apache.hadoop.mapred.jobcontrol.JobControl
should solve it. The only problem is: that is what I am already using. When I take a look at that particular file (hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/jobcontrol/JobControl.java in the source code), I see that it uses
import org.apache.hadoop.mapred.jobcontrol.Job;
instead of
import org.apache.hadoop.mapreduce.Job;
That seems to me to be what is causing the error (correct?). Is there any way, other than reverting my code back to mapred, to get around this? Or any other way of running multiple MapReduce jobs?
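Judging from the Javadoc, the new-API JobControl seems to expect its jobs wrapped in ControlledJob rather than passed in as raw Job objects, so I would have guessed the intended usage looks roughly like the sketch below (assuming the ControlledJob(Job, List<ControlledJob>) constructor is the right entry point); but I am not sure whether this actually sidesteps the mapred dependency:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;

public class ChainDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job1 = new Job(conf, "step1");  // new-API jobs
        Job job2 = new Job(conf, "step2");
        // ... mapper/reducer classes and input/output paths for each job ...

        // Wrap each Job in a ControlledJob; null means no dependencies yet
        ControlledJob cjob1 = new ControlledJob(job1, null);
        ControlledJob cjob2 = new ControlledJob(job2, null);
        cjob2.addDependingJob(cjob1);  // job2 runs only after job1 succeeds

        JobControl jbcntrl = new JobControl("chain");
        jbcntrl.addJob(cjob1);
        jbcntrl.addJob(cjob2);

        // JobControl implements Runnable: drive it from a thread and poll
        Thread t = new Thread(jbcntrl);
        t.start();
        while (!jbcntrl.allFinished()) {
            Thread.sleep(500);
        }
        jbcntrl.stop();
    }
}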
Update: I got method 1 from http://cloudcelebrity.wordpress.com/2012/03/30/how-to-chain-multiple-mapreduce-jobs-in-hadoop/ to work, but I am still interested in an answer to the problem above.
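For reference, method 1 is just running the jobs back to back in the driver and feeding the output of the first job to the second; what I have working looks roughly like this (a trimmed sketch, with a made-up intermediate path):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SequentialDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path intermediate = new Path("/tmp/chain-intermediate");  // made-up temp dir

        Job job1 = new Job(conf, "step1");
        // ... job1 mapper/reducer classes ...
        FileInputFormat.addInputPath(job1, new Path(args[0]));
        FileOutputFormat.setOutputPath(job1, intermediate);
        if (!job1.waitForCompletion(true)) {  // block until job1 finishes
            System.exit(1);
        }

        Job job2 = new Job(conf, "step2");
        // ... job2 mapper/reducer classes ...
        FileInputFormat.addInputPath(job2, intermediate);  // job1's output feeds job2
        FileOutputFormat.setOutputPath(job2, new Path(args[1]));
        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}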