I'd like to set the # of reduce tasks to be exactly equal to the # of available reduce slots in one job.
By default, the # of reduce tasks is calculated as ~1.75 times the # of reduce slots available (on Elastic MapReduce). I notice that my job's reduce tasks complete very uniformly, so it would be better to run 1 reducer per reduce slot in this job.
But how can I look up the cluster metrics (specifically, the # of available reduce slots) from within my job configuration?
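
To illustrate the reasoning behind wanting 1 reducer per slot, here is a rough sketch of the arithmetic. It assumes every reduce task takes time proportional to its share of the data and ignores task startup overhead; both are simplifications. The class name `ReducerWaves` and the slot count of 20 are hypothetical, chosen only for the example.

```java
public class ReducerWaves {

    // Number of sequential "waves" needed to run `tasks` reduce tasks
    // when only `slots` of them can run at the same time (ceiling division).
    static int waves(int tasks, int slots) {
        return (tasks + slots - 1) / slots;
    }

    // Reduce-phase duration relative to a single wave in which each task
    // handles 1/slots of the data (that baseline is 1.0).
    // Assumption: a task's run time is proportional to its share of the data.
    static double relativeTime(int tasks, int slots) {
        double perTaskTime = (double) slots / tasks;
        return waves(tasks, slots) * perTaskTime;
    }

    public static void main(String[] args) {
        int slots = 20; // hypothetical cluster size
        int defaultTasks = (int) Math.round(1.75 * slots); // the ~1.75x default
        int onePerSlot = slots;

        System.out.println("default tasks: " + defaultTasks
                + ", waves: " + waves(defaultTasks, slots)
                + ", relative time: " + relativeTime(defaultTasks, slots));
        System.out.println("one per slot:  " + onePerSlot
                + ", waves: " + waves(onePerSlot, slots)
                + ", relative time: " + relativeTime(onePerSlot, slots));
    }
}
```

Under these assumptions, 35 tasks on 20 slots need two waves, with the second wave leaving 5 slots idle, so the reduce phase takes roughly 1.14 times as long as a single wave of 20 tasks. With uneven task durations the extra tasks can help balance load, which is why this only applies to jobs whose reducers finish uniformly.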