I have a question about the transfer protocols the Hadoop framework uses to copy mapper output (which is stored locally on the mapper node) to reducer tasks (which may not run on the same node).

- Some blogs say the shuffle phase uses HTTP.
- I have also read that HDFS data transfers (used by MapReduce jobs) are done directly over TCP/IP sockets.
- I read about RPC in Hadoop: The Definitive Guide.
Any pointers/references would be of great help.