What is the difference between a mapper and a map task? Similarly, between a reducer and a reduce task? Also, how are the numbers of mappers, map tasks, reducers and reduce tasks determined during the execution of a MapReduce job? Describe the interrelationships between them, if there are any.
1 Answer
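Simply put, a map task is a running instance of the Mapper: the framework launches one map task per input split, and each task runs the job's Mapper over the records in its split. Likewise, a reduce task is a running instance of the Reducer. Mapper and Reducer themselves are the classes in which you implement the map() and reduce() methods of a MapReduce job.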
When we run a MapReduce job, the number of map tasks spawned depends on the number of input splits, which by default correspond to the HDFS blocks of the input. The number of reduce tasks, however, can be specified in the MapReduce driver code, either by setting the property mapred.reduce.tasks (named mapreduce.job.reduces in newer Hadoop versions) in the job configuration object or by calling the org.apache.hadoop.mapreduce.Job#setNumReduceTasks(int reducerCount) method.
The old JobConf API had a setNumMapTasks() method, but it was only a hint to the framework. It has been removed from the new API (org.apache.hadoop.mapreduce.Job), with the intention that the number of mappers be calculated from the input splits.
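
To make the link between input size and map task count concrete, here is a minimal sketch in plain Java (no Hadoop dependency) of the arithmetic that, as far as I know, FileInputFormat applies to a single non-empty, splittable file. The class and method names (SplitCountSketch, countSplits) are made up for illustration, and the minimum and maximum split sizes passed in main are assumed to be the defaults; treat it as an approximation rather than the framework's actual code.

```java
public class SplitCountSketch {
    // Tolerance assumed to match FileInputFormat: the last split may be
    // up to 10% larger than splitSize before another split is created.
    private static final double SPLIT_SLOP = 1.1;

    // Split size is the block size clamped between the configured minimum and maximum.
    public static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Number of splits (and therefore map tasks) for one non-empty, splittable file.
    public static int countSplits(long fileLength, long splitSize) {
        if (fileLength <= 0 || splitSize <= 0) {
            throw new IllegalArgumentException("fileLength and splitSize must be positive");
        }
        int splits = 0;
        long remaining = fileLength;
        // Carve off full-size splits while clearly more than one split's worth remains.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        // Whatever is left over becomes the final split.
        if (remaining != 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed defaults: minimum split size 1 byte, no upper limit, 128 MB block size.
        long splitSize = computeSplitSize(128 * mb, 1L, Long.MAX_VALUE);
        System.out.println(countSplits(1024 * mb, splitSize)); // prints 8
        System.out.println(countSplits(300 * mb, splitSize));  // prints 3
        System.out.println(countSplits(130 * mb, splitSize));  // prints 1
    }
}
```

With these assumptions a 1 GB file on 128 MB blocks gives 8 map tasks, while a 130 MB file still gives only one, because the 2 MB remainder falls within the 10% tolerance. A job with several input files would sum this count over the files, and a file in a non-splittable compression format would contribute a single split regardless of its size.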

SachinJose
- Can the number of mappers also be specified by using mapred.map.tasks and setNumMapTasks? – user3458106 Mar 25 '14 at 06:25
- I have modified the answer. – SachinJose Mar 25 '14 at 06:49
- Also, how can I check the actual number of mapper/reducer tasks running for a particular job? – user3458106 Mar 25 '14 at 06:52
- The number of map/reduce tasks is shown in the JobTracker's web UI. – SachinJose Mar 25 '14 at 07:00