I am running three MapReduce jobs in sequence (output of one is the input to another) on a Hadoop cluster with 3 nodes (1 master and 2 slaves).
Apparently, the total time taken by individual jobs to finish on a single node cluster is less than the above by quite a margin. What could be the possible reasons? Is it the network latency? It's running on 100Mbps Ethernet network. Will it help if I increase the number of nodes?
I am using Hadoop Streaming and my code is in python2.7.