
I've searched for some time and I've found that a MapReduce cluster using Hadoop 2 + YARN has the following number of concurrent maps and reduces per node:

Concurrent Maps # = yarn.nodemanager.resource.memory-mb / mapreduce.map.memory.mb
Concurrent Reduces # = yarn.nodemanager.resource.memory-mb / mapreduce.reduce.memory.mb

However, I've set up a cluster with 10 machines, with these configurations:

'yarn_site' => {
  'yarn.nodemanager.resource.cpu-vcores' => '32',
  'yarn.nodemanager.resource.memory-mb' => '16793',
  'yarn.scheduler.minimum-allocation-mb' => '532',
  'yarn.nodemanager.vmem-pmem-ratio' => '5',
  'yarn.nodemanager.pmem-check-enabled' => 'false'
},
'mapred_site' => {
  'mapreduce.map.memory.mb' => '4669',
  'mapreduce.reduce.memory.mb' => '4915',
  'mapreduce.map.java.opts' => '-Xmx4669m',
  'mapreduce.reduce.java.opts' => '-Xmx4915m'
}
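
For reference, a quick sanity check (my own sketch, simply plugging the posted values into the formula above) would predict about 3 map and 3 reduce containers per node, i.e. roughly 30 concurrent maps across the 10 machines:

    # Sanity check (my own sketch): apply the formula from above to the posted values.
    yarn_nm_memory_mb = 16793   # yarn.nodemanager.resource.memory-mb
    map_memory_mb = 4669        # mapreduce.map.memory.mb
    reduce_memory_mb = 4915     # mapreduce.reduce.memory.mb
    nodes = 10

    maps_per_node = yarn_nm_memory_mb // map_memory_mb        # 3
    reduces_per_node = yarn_nm_memory_mb // reduce_memory_mb  # 3

    print(maps_per_node * nodes)     # ~30 concurrent maps expected cluster-wide
    print(reduces_per_node * nodes)  # ~30 concurrent reduces expected cluster-wide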

But after the cluster is set up, Hadoop only allows 6 containers for the entire cluster. What am I forgetting? What am I doing wrong?

Luís Guilherme
  • Did you ever figure this one out, Luis? I believe the formula is more like the one from the Cloudera blog post linked to in my question -- http://stackoverflow.com/questions/25193201/how-to-set-the-precise-max-number-of-concurrently-running-tasks-per-node-in-hado -- but I find it's not quite right on EMR. – verve Aug 08 '14 at 11:51
  • 6 containers for a 10-machine cluster? That's weird. Are the same machines always empty of tasks? Does your job have enough mappers/reducers to launch? – ALSimon Dec 23 '14 at 16:53

1 Answer


Not sure if this is the same issue you're having, but I ran into something similar: I launched an EMR cluster with 20 c3.8xlarge nodes in the core instance group and likewise found the cluster severely underutilized when running a job (only 30 mappers ran concurrently across the entire cluster, even though the YARN and MapReduce memory/vcore settings for my cluster allow over 500 concurrent containers). I was using Hadoop 2.4.0 on AMI 3.5.0.

It turns out that the instance group matters for some reason. When I relaunched the cluster with 20 nodes in the task instance group and only 1 core node, it made a HUGE difference: I got 500+ mappers running concurrently (in my case the mappers were mostly downloading files from S3, so they don't need HDFS).
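
For anyone who wants to reproduce that layout programmatically, here is a minimal sketch (mine, not from the original setup) using boto3's EMR client; the cluster name, region and IAM roles are assumptions, while the instance type, counts and AMI version are the ones mentioned above:

    # Hypothetical sketch: launch an EMR cluster with 1 CORE node and 20 TASK nodes.
    import boto3

    emr = boto3.client("emr", region_name="us-east-1")  # region is an assumption

    response = emr.run_job_flow(
        Name="mostly-task-nodes",   # hypothetical cluster name
        AmiVersion="3.5.0",         # AMI version mentioned in the answer
        Instances={
            "InstanceGroups": [
                {"Name": "master", "InstanceRole": "MASTER", "Market": "ON_DEMAND",
                 "InstanceType": "c3.8xlarge", "InstanceCount": 1},
                {"Name": "core", "InstanceRole": "CORE", "Market": "ON_DEMAND",
                 "InstanceType": "c3.8xlarge", "InstanceCount": 1},
                {"Name": "task", "InstanceRole": "TASK", "Market": "ON_DEMAND",
                 "InstanceType": "c3.8xlarge", "InstanceCount": 20},
            ],
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        JobFlowRole="EMR_EC2_DefaultRole",  # assumed default instance profile
        ServiceRole="EMR_DefaultRole",      # assumed default service role
    )
    print(response["JobFlowId"])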

I'm not sure why the different instance group type makes a difference, given that both can equally run tasks, but clearly they are being treated differently.

I thought I'd mention it here, given that I ran into this issue myself and using a different group type helped.

Dia Kharrat