Which function sorts the output of a Map task in the Reduce phase in Hadoop Src 2.7.1, and when does the sorting phase begin?
I want to know which function in Hadoop is responsible for sorting the Map output, and which sorting algorithm it uses.
The map output is sorted with Quicksort while the intermediate KV (key-value) pairs generated by a Map task are being spilled, and each sorted partition then goes to its particular Reducer.
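To make the map-side step concrete, here is a minimal sketch of sorting one spill by (partition, key) with a hand-written quicksort. It is not Hadoop's code: the class `SpillSortSketch`, the `Rec` type, and `sortSpill` are made up for illustration, and the sketch sorts whole records, whereas Hadoop, as far as I know, sorts index entries that point into its buffer. The partition formula is the one the default `HashPartitioner` uses.

```java
import java.util.Arrays;

public class SpillSortSketch {
    /** One buffered map output record, tagged with its target reducer (hypothetical type). */
    static final class Rec {
        final int partition;
        final String key;
        final String value;

        Rec(int partition, String key, String value) {
            this.partition = partition;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return "p" + partition + ":" + key + "=" + value;
        }
    }

    /** Same formula as Hadoop's default HashPartitioner. */
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    /** Order by partition first, then by key, so each reducer's slice is contiguous and sorted. */
    static int compare(Rec a, Rec b) {
        if (a.partition != b.partition) {
            return Integer.compare(a.partition, b.partition);
        }
        return a.key.compareTo(b.key);
    }

    /** Plain in-place quicksort with a middle pivot (not stable). */
    static void quickSort(Rec[] recs, int lo, int hi) {
        if (lo >= hi) {
            return;
        }
        Rec pivot = recs[lo + (hi - lo) / 2];
        int i = lo;
        int j = hi;
        while (i <= j) {
            while (compare(recs[i], pivot) < 0) i++;
            while (compare(recs[j], pivot) > 0) j--;
            if (i <= j) {
                Rec tmp = recs[i];
                recs[i] = recs[j];
                recs[j] = tmp;
                i++;
                j--;
            }
        }
        quickSort(recs, lo, j);
        quickSort(recs, i, hi);
    }

    /** Tags each {key, value} pair with its partition and sorts the whole spill. */
    static Rec[] sortSpill(String[][] kvs, int numReducers) {
        Rec[] recs = new Rec[kvs.length];
        for (int n = 0; n < kvs.length; n++) {
            recs[n] = new Rec(partitionFor(kvs[n][0], numReducers), kvs[n][0], kvs[n][1]);
        }
        quickSort(recs, 0, recs.length - 1);
        return recs;
    }

    public static void main(String[] args) {
        String[][] buffered = {{"c", "1"}, {"a", "1"}, {"b", "1"}, {"a", "2"}};
        System.out.println(Arrays.toString(sortSpill(buffered, 2)));
    }
}
```

If I remember the 2.x source correctly, the real sort is triggered from `sortAndSpill()` in `MapTask`'s `MapOutputBuffer`, and the sorter class defaults to `org.apache.hadoop.util.QuickSort`; treat that as a pointer to check against the source rather than a definitive answer.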
On the Reducer side, the KV pairs are sorted again using merge sort and then grouped by key. Sorting is needed on the Reducer side because pairs with the same key may arrive from any number of Map tasks.
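The reduce-side step can be sketched as a k-way merge: each Map task's output for this reducer is already sorted, so only the merge half of merge sort is needed, followed by grouping equal keys. This is an illustration under my own assumptions, not Hadoop's implementation; `ReduceMergeSketch`, `Cursor`, `kv`, and `mergeAndGroup` are hypothetical names, and everything is held in memory here.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class ReduceMergeSketch {
    /** Read position inside one sorted run, i.e. one map task's output (hypothetical type). */
    static final class Cursor {
        final List<Map.Entry<String, String>> run;
        int pos = 0;

        Cursor(List<Map.Entry<String, String>> run) {
            this.run = run;
        }

        String key() {
            return run.get(pos).getKey();
        }
    }

    /** Small helper to build a key-value pair. */
    static Map.Entry<String, String> kv(String key, String value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    /** Merges runs that are each sorted by key and groups the values per key, in key order. */
    static Map<String, List<String>> mergeAndGroup(List<List<Map.Entry<String, String>>> runs) {
        // The heap always exposes the run whose next record has the smallest key.
        PriorityQueue<Cursor> heap =
            new PriorityQueue<Cursor>((a, b) -> a.key().compareTo(b.key()));
        for (List<Map.Entry<String, String>> run : runs) {
            if (!run.isEmpty()) {
                heap.add(new Cursor(run));
            }
        }

        // Insertion order equals key order because records leave the heap sorted by key.
        Map<String, List<String>> groups = new LinkedHashMap<>();
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();
            Map.Entry<String, String> e = c.run.get(c.pos++);
            groups.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
            if (c.pos < c.run.size()) {
                heap.add(c); // re-insert with its next key
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> fromMap1 = Arrays.asList(kv("a", "1"), kv("c", "1"));
        List<Map.Entry<String, String>> fromMap2 = Arrays.asList(kv("a", "1"), kv("b", "1"));
        System.out.println(mergeAndGroup(Arrays.asList(fromMap1, fromMap2)));
    }
}
```

Each group in the result corresponds to one call of the reduce function, with the key and all of its values. The order of values inside a group is not guaranteed in this sketch.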