Process the results of a Reducer before storing them

Question

I am trying to write an HBase MapReduce job that produces the top 10 users of my HBase table, using the following Reducer:

class Top10usersReducer extends Reducer<Text, IntWritable, Text, TreeMap<Text, IntWritable>> {

    public static final byte[] CF = "infos".getBytes();
    public static final byte[] COUNT = "count".getBytes();
    // TreeMap keeps its entries ordered by key (the user id), not by count
    static TreeMap<Text, IntWritable> map = new TreeMap<Text, IntWritable>();

    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {

        // Sum all counts emitted for this user
        int sum = 0;

        for (IntWritable val : values) {
            sum += val.get();
        }
        map.put(key, new IntWritable(sum));
        context.write(null, map);

    }
}

I get this output, with all the records stored on one line:

id11841=4,id11993=8,id12493=6,id12592=2,id12706=7,id12871=1,id12990=3,id13092=10,id13528=5,id13580=9

I would like this result instead, sorted by count in descending order:

id13092=10,id13580=9,id11993=8,id12706=7,id12493=6,id13528=5,id11841=4,id12990=3,id12592=2,id12871=1

Any idea what processing I should add to the Reducer to achieve this?

See [here](http://stackoverflow.com/questions/8289508/sorting-by-value-in-hadoop-from-a-file) — Paul Samsotha, May 23 '14 at 12:23

score 0 · Answer 1 · answered May 25 '14 at 22:07


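The MapReduce framework sorts records by key, not by value, so the output of this job is ordered by user id. If you want the reducer output sorted by value, you need to write another MR job, typically one that emits the count as its key so that the framework's own sort orders the records by count.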
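As a rough illustration of the by-value ordering itself, here is a minimal sketch in plain Java. It is not taken from the answer: the class name `TopNByValue` and the method `topN` are hypothetical, and it uses `String` and `Integer` in place of Hadoop's `Text` and `IntWritable` so that it runs without a cluster. Where such a step would sit in a real job (for example in the second job's reducer) is an assumption left to the reader.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only: orders (user, count) pairs by count.
public class TopNByValue {

    // Returns at most n entries with the largest counts, highest count first.
    // Ties are broken by key so that the result is deterministic.
    public static List<Map.Entry<String, Integer>> topN(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue()); // descending by count
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }

    public static void main(String[] args) {
        // Same data as in the question; a TreeMap iterates in key order.
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("id11841", 4);
        counts.put("id11993", 8);
        counts.put("id12493", 6);
        counts.put("id12592", 2);
        counts.put("id12706", 7);
        counts.put("id12871", 1);
        counts.put("id12990", 3);
        counts.put("id13092", 10);
        counts.put("id13528", 5);
        counts.put("id13580", 9);

        // Join the sorted entries as key=value pairs separated by commas.
        StringBuilder line = new StringBuilder();
        for (Map.Entry<String, Integer> e : topN(counts, 10)) {
            if (line.length() > 0) {
                line.append(',');
            }
            line.append(e.getKey()).append('=').append(e.getValue());
        }
        System.out.println(line);
    }
}
```

Run on the counts from the question, `main` prints the entries in the order the question asks for, starting with `id13092=10` and ending with `id12871=1`. Holding every count in memory like this is only reasonable when the number of distinct users is small.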


Sudheer

