I need help with a MapReduce job: my custom partitioner is never invoked. I have checked everything many times, with no result. It used to work a while ago, and I have no idea why it doesn't now. Any help would be much appreciated.
I am adding the code below (it fails both for custom keys as input and for very simple cases).
The mapper definitely outputs the right values, and then the partitioner is skipped.

//import of libs
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
...

public class hbaseCountTest extends Configured implements Tool {
....

static class myMapper extends TableMapper<Text,Text> {
    @Override
    public void map(ImmutableBytesWritable rowKey, Result result, Context context) throws IOException, InterruptedException {
        ...... //dropping some calculations
        context.write(new Text(gender), new Text(capsStr)); // everything is right here, checked.
    }

}

    public static class myPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int NumReduceTasks) {
        // getPartition IS NEVER INVOKED
        System.out.println("partitioner started"); 
        String heur = value.toString().split(":")[0]; 
        int h = Integer.parseInt(heur);
        if (h<10) {
            ... return... //dropping some calculations
        } else if (h>9 && h<19) {
            ...
        } else 
            {
            ...
            }
    }

}

@Override
public int run(String[] arg0) throws Exception {
    Job job = Job.getInstance(getConf(), "jobName1");
    job.setNumReduceTasks(3);
    job.setJarByClass(getClass());
    Configuration conf = job.getConfiguration();
    HBaseConfiguration.merge(conf, HBaseConfiguration.create(conf));
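    // addResource(String) looks the name up on the classpath; wrap in Path to read from the local filesystem
    conf.addResource(new Path("/home/hadoop/Training/CDH4/hadoop-2.0.0-cdh4.0.0/conf/hadoop-local.xml"));
    conf.addResource(new Path("/home/hadoop/Training/CDH4/hadoop-2.0.0-cdh4.0.0/conf/mapred-site.xml"));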
    FileSystem fs = FileSystem.get(conf);
    Path outputPath = new Path(arg0[0]);
    if (fs.exists(outputPath)) {
        fs.delete(outputPath, true); // recursive delete; the single-argument form is deprecated
    }
    Scan scan = new Scan();
    scan.addColumn(toBytes(famName), toBytes(colNamePage));
    scan.addColumn(toBytes(famName), toBytes(colNameTime));
    scan.addColumn(toBytes(famName1), toBytes(colNameRegion));
    scan.addColumn(toBytes(famName1), toBytes(colNameGender));
    TableMapReduceUtil.initTableMapperJob(tableName, scan, myMapper.class, Text.class, Text.class, job);

    job.setPartitionerClass(myPartitioner.class);
    job.setReducerClass(myReducer.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    TextOutputFormat.setOutputPath(job, new Path(arg0[0]));
    // Output key/value classes must be data types, not the output format (assuming myReducer emits Text/Text)
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    return job.waitForCompletion(true)?0:1;
}

}

Thanks a lot in advance,
Alex

  • What are you looking at to conclude that your partitioner is not being invoked? – rVr Mar 06 '14 at 12:34
  • I tried to debug in local mode, then in pseudo-distributed mode. I look for the System.out.println message in the Eclipse console window, and from the result I can see that partitioning doesn't happen. By the way, launching the jar from the command line with the -partitioner option works =/ I don't understand why it doesn't work in Hadoop – Alexander Komarov Mar 06 '14 at 12:46
  • Hmm... that could be because the conf is not picking up the custom partitioner... let's look more closely at the driver class – rVr Mar 06 '14 at 12:50
  • What about launching the jar from the command line, without the -partitioner option? I also had similar issues with the Eclipse plugin, and everything worked fine with the jars. – vefthym Mar 06 '14 at 15:40
  • Yes, it works from the command line even without the -partitioner option, but not in Eclipse =/ Besides, it used to work in Eclipse... maybe I am just missing something... – Alexander Komarov Mar 06 '14 at 17:39
  • @Alexander: Can you wrap the entire Partitioner code in a try{}catch{} block and post the exception if you catch one? Please post your partition method code in full. – Ravindra babu Jan 10 '16 at 16:52

1 Answer

Try to set the number of reducers to any number greater than the number of unique keys.
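
Why the reducer count could matter, offered as an assumption rather than a confirmed diagnosis: as far as I know, Hadoop's map-side collector only instantiates the configured partitioner when the job actually runs with more than one reduce task, and the local job runner in older releases may run with at most one reducer regardless of setNumReduceTasks. The sketch below imitates that selection in plain Java. SimplePartitioner, CountingPartitioner and choose are hypothetical names made up for illustration, not Hadoop API, and the hash-based partition is a placeholder for the real logic.

```java
public class PartitionerSelectionSketch {

    /** Hypothetical stand-in for org.apache.hadoop.mapreduce.Partitioner<Text, Text>. */
    interface SimplePartitioner {
        int getPartition(String key, String value, int numReduceTasks);
    }

    /** Counts its invocations so we can see whether the custom logic ever ran. */
    static class CountingPartitioner implements SimplePartitioner {
        int calls = 0;

        @Override
        public int getPartition(String key, String value, int numReduceTasks) {
            calls++;
            // Placeholder logic (assumption): spread keys by non-negative hash.
            return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
        }
    }

    /**
     * Imitates the assumed framework behaviour: the configured partitioner is
     * used only when there is more than one reduce task; otherwise a trivial
     * partitioner sends every record to the last (only) partition.
     */
    static SimplePartitioner choose(int numReduceTasks, SimplePartitioner configured) {
        if (numReduceTasks > 1) {
            return configured;
        }
        return (key, value, n) -> n - 1;
    }

    public static void main(String[] args) {
        CountingPartitioner custom = new CountingPartitioner();

        // One effective reducer: the custom partitioner is bypassed.
        choose(1, custom).getPartition("male", "5:abc", 1);
        System.out.println("calls with 1 reducer:  " + custom.calls);

        // Three effective reducers: the custom partitioner is invoked.
        choose(3, custom).getPartition("male", "5:abc", 3);
        System.out.println("calls with 3 reducers: " + custom.calls);
    }
}
```

If this is what is happening, the println inside getPartition would stay silent whenever the job effectively runs with a single reducer, which would fit a job that behaves differently in Eclipse and on the command line. Checking how many reduce tasks the job really launches (for example in the job counters or logs) would confirm or rule this out.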

Yahia