I am running Hive jobs on a Hadoop cluster. I recently learned that performance can improve or change depending on how the mappers and reducers are configured, but I haven't experimented with this yet. So far I have only run Hive queries with the default mapper and reducer settings.
From what I know about mappers and reducers, I am unsure what values to set for them in order to affect performance. Also, do these settings need to be applied on the master node only, or on all nodes?
If anyone has experience with this, please explain how it works.
Also, what other parameters do we need to set when executing jobs?