I wrote this code to sort text with MapReduce:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public static class SortMapper extends Mapper<Object, Text, Text, Text> {
    private Text citizenship = new Text();

    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        // Key each record by its 12th comma-separated field (the citizenship column)
        citizenship.set(value.toString().split(",")[11]);
        context.write(citizenship, value);
    }
}

public static class PrintReducer extends Reducer<Text, Text, NullWritable, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        // Drop the key and emit the records in sorted-key order
        for (Text value : values) {
            context.write(NullWritable.get(), value);
        }
    }
}

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "Football Sort");
    job.setJarByClass(FootballSort.class);
    job.setMapperClass(SortMapper.class);
    job.setCombinerClass(PrintReducer.class);
    job.setReducerClass(PrintReducer.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);
    job.setOutputKeyClass(NullWritable.class);
    job.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
but it always throws an IOException (reported at lines 26 and 34) with the reason: class org.apache.hadoop.io.NullWritable is not class org.apache.hadoop.io.Text
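A likely cause, for what it's worth: a combiner runs between the map and reduce phases, so its output key/value classes must match the declared map output classes (`Text`, `Text` here), but `PrintReducer` writes `NullWritable` keys, which matches the wording of the error above. A sketch of the driver with the combiner line removed (assuming the combiner was not essential to your intent):

```java
// Same driver, minus the combiner: PrintReducer's output types
// (NullWritable, Text) cannot legally feed back into the shuffle,
// which expects the map output types (Text, Text).
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "Football Sort");
job.setJarByClass(FootballSort.class);
job.setMapperClass(SortMapper.class);
// job.setCombinerClass(PrintReducer.class);  // removed: emits NullWritable keys
job.setReducerClass(PrintReducer.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
job.setOutputKeyClass(NullWritable.class);
job.setOutputValueClass(Text.class);
```

A combiner is only an optimization, so dropping it does not change the final output, only the amount of data shuffled.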
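Separately, note that the mapper assumes every input line has at least 12 comma-separated fields, since `split(",")[11]` picks the 12th one; a shorter line would fail with `ArrayIndexOutOfBoundsException`. A standalone illustration (no Hadoop needed; the class and field names here are made up for the demo):

```java
public class SplitDemo {
    // Mirrors the mapper's key extraction: 12th CSV field, index 11
    public static String citizenshipOf(String line) {
        return line.split(",")[11];
    }

    public static void main(String[] args) {
        // A 12-field line: index 11 is the last field
        String line = "f0,f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,Brazil";
        System.out.println(citizenshipOf(line)); // prints "Brazil"
    }
}
```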