I agree that I have only one Reducer function per job. However, when I run Hadoop as a simulation in NetBeans (not in distributed mode), it creates one reducer task for each unique key. For instance, if I have only 3 keys (k1, k2, k3), it calls the reduce function 3 times, once for each of these keys (a minimal sketch of the reducer I mean follows the example below).
example:
Reducer: key=k1
values which correspond to k1
Reducer: key=k2
values which correspond to k2
Reducer: key=k3
values which correspond to k3
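To make concrete what I mean, this is a minimal sketch of the kind of reducer I am running (class name and types are just my placeholders, not the exact code): the framework invokes reduce() once per distinct key, which is what produces the "Reducer: key=..." lines above in my simulation.

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sketch only: reduce() is called once per distinct key, and within each
// call I can iterate only over the values that belong to that key.
public class MyReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        System.out.println("Reducer: key=" + key);
        for (Text value : values) {
            context.write(key, value);   // values which correspond to this key
        }
    }
}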
Therefore, the values that correspond to key k1 can be accessed only from that reducer task, and the same goes for the k2 and k3 values. What I want to do is gather k1 and k2 into the same task (assuming these two keys have something in common) so that I can access all of their values (those corresponding to k1 and k2) from a single reducer task; a sketch of what I have in mind follows.
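What I imagine is a custom Partitioner along these lines (the class name and the literal key strings are placeholders of mine, not code from an actual job):

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Hypothetical sketch: send k1 and k2 to the same partition so that one
// reduce task receives the value groups of both keys; every other key
// goes to the second partition.
public class GroupingPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numPartitions) {
        String k = key.toString();
        if (k.equals("k1") || k.equals("k2")) {
            return 0;
        }
        // Guard for jobs configured with a single reduce task.
        return 1 % numPartitions;
    }
}

In the driver I would then call job.setPartitionerClass(GroupingPartitioner.class) and job.setNumReduceTasks(2). As far as I understand, reduce() would still be called once per key even inside that task, but at least the k1 and k2 value groups would end up in the same reducer task, which is what I am after.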
In addition, I read this example and thought I understood it, until I ran it and saw that it again creates 2 reducer tasks, not 3, which is the number of age groups in the partitioner. I sketch my reconstruction of that partitioner after the output below.
output example:
Reducer: female
Monica<tab>56<tab>92
Kristine<tab>38<tab>53
Alice<tab>23<tab>45
Nancy<tab>7<tab>98
Mary<tab>6<tab>93
Clara<tab>87<tab>72
Reducer: male
James<tab>34<tab>79
Jacob<tab>7<tab>23
Alex<tab>52<tab>69
Bob<tab>34<tab>89
Chris<tab>67<tab>97
Adam<tab>9<tab>37
Connor<tab>25<tab>27
Daniel<tab>78<tab>95
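For reference, this is roughly how I understood the partitioner from that example (my own reconstruction from memory, not the exact code): the map output key is the gender, the value holds name/age/score, and the partition is chosen by age bucket, with the job set to three reduce tasks via job.setNumReduceTasks(3).

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// My reconstruction (assumed value format: name<tab>age<tab>score):
// three age buckets mapped to partitions 0, 1 and 2.
public class AgeGroupPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numReduceTasks) {
        int age = Integer.parseInt(value.toString().split("\t")[1]);
        if (numReduceTasks == 0) {
            return 0;
        }
        if (age <= 20) {
            return 0;
        } else if (age <= 30) {
            return 1 % numReduceTasks;
        } else {
            return 2 % numReduceTasks;
        }
    }
}

With three partitions I expected three reducer tasks, yet the output above only ever shows the two gender keys, which is what confuses me.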