How combiner works when we use multiple inputs in Hadoop MapReduce

Question

I am implementing a reduce-side join in Hadoop MapReduce (Java), and for that purpose I am using multiple inputs. For example, there are two files, Customers and Orders, and I join them on cid (customer_id).

My questions:

If I write a combiner class for the above program, how is it going to work? As far as I know, a combiner is a mapper-level aggregator, but in this case we have two mapper logics.
Will the combiner logic be applied to the output of both mappers?
Is there any way to apply the combiner logic to the output of only one mapper?

score 0 · Answer 1 · answered Mar 02 '21 at 06:01

A combiner aggregates mapper output, and you can override its reduce method with whatever logic you think is better. A combiner is also known as a mini-reducer, and the combiner class extends the Reducer class.

Remember that the combiner is not guaranteed to run in all cases, so your mapper output should always be valid reducer input on its own.

I don't quite get your question: whatever your mapper input is, the mapper output will be some key-value data. The combiner just aggregates those pairs, for example by adding up the values for each key. Say your mapper output is:

{'ali':2, 'jack':4, 'ali':3}

After combining, your output will be:

{'ali':5, 'jack':4}
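
The merge above can be modelled in plain Java with no Hadoop dependency. This is only a sketch of the summing logic: the names `CombinerSketch` and `combine` are made up for illustration, and in a real job the same logic would sit in the `reduce` method of a `Reducer` subclass registered with `job.setCombinerClass(...)`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plain-Java model of a summing combiner (not the Hadoop API).
class CombinerSketch {

    // Sums the values of all pairs that share a key, keeping first-seen key order.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> mapperOutput) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : mapperOutput) {
            combined.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapperOutput = List.of(
                Map.entry("ali", 2),
                Map.entry("jack", 4),
                Map.entry("ali", 3));
        System.out.println(combine(mapperOutput)); // prints {ali=5, jack=4}
    }
}
```

Because summing is associative and commutative, the reducer gets the same totals whether this step runs zero, one, or several times, which is what the "not guaranteed to run" caveat requires.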

Yes, of course the combiner aggregates the mappers' output. However, say I am implementing a reduce-side join on two files using MR, so I have multiple mapper logics. My question is: how do I implement a combiner in that situation? - Sandeep Patil, Mar 29 '21 at 16:02
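
One possible approach, sketched under assumptions: in Hadoop the combiner is set once per job with `job.setCombinerClass(...)`, so it sees the map output of every mapper registered through `MultipleInputs`. To aggregate only one side of the join, the combiner can check a source tag that each mapper puts on its values and pass the other side through unchanged. The tag format (`C|` for customers, `O|` for orders) and the names `TaggedCombinerSketch` and `combine` are hypothetical, and the code is a plain-Java model of the logic, not the Hadoop API. It also assumes the reducer only needs the order total per cid, not each individual order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical plain-Java model of a combiner for a tagged reduce-side join (not the Hadoop API).
class TaggedCombinerSketch {

    // Values for one cid: "C|<name>" from the customers mapper, "O|<amount>" from the orders mapper.
    // Only order values are pre-aggregated; customer values pass through unchanged.
    static List<String> combine(List<String> valuesForOneKey) {
        List<String> out = new ArrayList<>();
        long orderTotal = 0;
        boolean sawOrder = false;
        for (String value : valuesForOneKey) {
            if (value.startsWith("O|")) {
                orderTotal += Long.parseLong(value.substring(2));
                sawOrder = true;
            } else {
                out.add(value); // customer record: no combining applied
            }
        }
        if (sawOrder) {
            out.add("O|" + orderTotal); // same format as mapper output
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(combine(List.of("O|10", "C|Ali", "O|5"))); // prints [C|Ali, O|15]
    }
}
```

The combined value keeps the same `O|<amount>` shape as the raw mapper output, so the reducer can handle both, and the result stays correct if the combiner is skipped or applied more than once.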
