I am running into a minor issue.
I am trying to get a separate output file for each key from the Reducer.
Partitioner
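import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class customPartitioner extends Partitioner<Text, NullWritable> implements Configurable {
    private Configuration configuration;
    @Override
    public void setConf(Configuration configuration) {
        this.configuration = configuration;
    }
    @Override
    public Configuration getConf() {
        return configuration;
    }
    // Pick a reducer from the key's hash code.
    @Override
    public int getPartition(Text key, NullWritable value, int numPartitions) {
        return Math.abs(key.hashCode()) % numPartitions;
    }
}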
I set the following in my driver class:
job0.setPartitionerClass(customPartitioner.class);
job0.setNumReduceTasks(5);
The reducer receives 5 keys:
[3, 0, 5, 8, 12]
So I expect 5 different files, one per key.
When I run this code I do get 5 part files, but the results are not what I expected.
OUTPUT
Found 6 items
-rw-r--r-- 3 sreeveni root 0 2015-12-09 11:44 /OUT/Part/OUT/_SUCCESS
-rw-r--r-- 3 sreeveni root 0 2015-12-09 11:44 /OUT/Part/OUT/part-r-00000
-rw-r--r-- 3 sreeveni root 4 2015-12-09 11:44 /OUT/Part/OUT/part-r-00001
-rw-r--r-- 3 sreeveni root 0 2015-12-09 11:44 /OUT/Part/OUT/part-r-00002
-rw-r--r-- 3 sreeveni root 4 2015-12-09 11:44 /OUT/Part/OUT/part-r-00003
-rw-r--r-- 3 sreeveni root 3 2015-12-09 11:44 /OUT/Part/OUT/part-r-00004
Two of the files are empty, and the others contain:
sreeveni@machine10:~$ hadoop fs -cat /OUT/Part/OUT/part-r-00001
3
8
sreeveni@machine10:~$ hadoop fs -cat /OUT/Part/OUT/part-r-00003
0
5
sreeveni@machine10:~$ hadoop fs -cat /OUT/Part/OUT/part-r-00004
12
Why do two keys end up in the same file?
Am I making a mistake in my code? Please help.
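For reference, here is a minimal plain-Java sketch of the arithmetic in getPartition, in case it helps narrow things down. It assumes String.hashCode() as a stand-in for Text.hashCode() (the real hash value may differ), and the names PartitionDemo, hashPartition and explicitPartition are hypothetical, not part of Hadoop. With this stand-in, "3" and "8" both map to partition 1, "0" and "5" both map to partition 3, and "12" maps to partition 4, which lines up with the listing above. The second method shows one possible alternative that assigns each known key its own index instead of relying on the hash.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionDemo {
    // Same formula as getPartition, with String.hashCode() standing in
    // for Text.hashCode() (an assumption made for illustration only).
    static int hashPartition(String key, int numPartitions) {
        return Math.abs(key.hashCode()) % numPartitions;
    }

    // Hypothetical alternative: use the key's position in a fixed list of
    // known keys, so distinct keys never share a partition as long as
    // numPartitions is at least the number of keys.
    static int explicitPartition(String key, List<String> knownKeys, int numPartitions) {
        int index = knownKeys.indexOf(key);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown key: " + key);
        }
        return index % numPartitions;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("3", "0", "5", "8", "12");
        for (String key : keys) {
            System.out.println(key
                    + " -> hash partition " + hashPartition(key, 5)
                    + ", explicit partition " + explicitPartition(key, keys, 5));
        }
    }
}
```

The hash-based version only spreads keys evenly on average. With just 5 keys and 5 reducers, nothing prevents two hash codes from being equal modulo 5, so a one-file-per-key layout probably needs an explicit mapping along the lines of explicitPartition.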