8

Suppose a YARN application has long-running tasks (an hour or longer). When an MR job starts, all cluster resources get blocked, at least until one container finishes, which can sometimes take a long time.

Is there a way to limit the number of simultaneously running containers? Something along the lines of, say, map.vcores.max (per NodeManager, or globally), so that other applications are not blocked.

Any ideas?

P.S. Hadoop 2.3.0

Roman Nikitchenko
Ivan Balashov

3 Answers

4

This behaviour/feature can be handled at the framework level rather than in YARN itself.

In MapReduce, mapreduce.job.running.map.limit and mapreduce.job.running.reduce.limit can be used to limit the number of simultaneously running containers.
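
For example, a minimal sketch of how these could be set (the values are illustrative and would go in the job configuration, e.g. mapred-site.xml or -D options at submit time):

    <!-- Illustrative values: run at most 10 map and 2 reduce tasks of this job at once -->
    <property>
      <name>mapreduce.job.running.map.limit</name>
      <value>10</value>
    </property>
    <property>
      <name>mapreduce.job.running.reduce.limit</name>
      <value>2</value>
    </property>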

In Tez, it can be handled using the property tez.am.vertex.max-task-concurrency.
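
A similar hedged sketch for Tez, assuming the property is set in tez-site.xml or in the DAG configuration (the value is illustrative):

    <!-- Illustrative value: run at most 10 tasks of any single vertex concurrently -->
    <property>
      <name>tez.am.vertex.max-task-concurrency</name>
      <value>10</value>
    </property>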

Related JIRAs:
https://issues.apache.org/jira/browse/MAPREDUCE-5583
https://issues.apache.org/jira/browse/TEZ-2914

SachinJose
2

As far as I can see, you cannot directly limit the number of containers; it is determined only by the available resources. So the best you can do is limit resources per application.

According to the Fair Scheduler documentation, you can assign your application to a dedicated queue. That gives you a configuration that is pretty close to what you want, since you can limit the memory and cores available to the queue; see the sketch below.
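
As an illustration only, a minimal fair-scheduler.xml allocation file, assuming a queue named longrunning (the queue name and limits are made up for the example):

    <allocations>
      <!-- Cap the resources and the number of concurrently running apps for this queue -->
      <queue name="longrunning">
        <maxResources>8192 mb,4 vcores</maxResources>
        <maxRunningApps>2</maxRunningApps>
      </queue>
    </allocations>

The application would then be submitted to that queue (for MapReduce, via mapreduce.job.queuename).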

Maybe you can switch to a different scheduler or even implement a custom one, but I don't like that approach: you step outside a well-tested environment, and I don't think you really need to do as much work as a custom implementation.

Roman Nikitchenko
  • Thanks. I was hoping for something simpler than setting up extra queues, but it looks like this is the simplest it can get. – Ivan Balashov Jul 10 '14 at 13:54
  • BTW, in some cases it might be worth implementing simple scheduler logic as a minimal extension of an existing one. At first glance I wasn't optimistic about it, but now I am somewhat in doubt. – Roman Nikitchenko Jul 29 '14 at 04:59
  • Also, if you use Spark, you should set FIFO as the ordering policy so that the driver and worker containers run together (otherwise, you can end up with a ton of Spark driver containers waiting for worker containers). – Thomas Decaux Jun 08 '17 at 14:09
0

If you are using resource pools, you can limit the number of applications that are running simultaneously in a single pool. While this is not quite what you asked for, it may prove useful.

If you are using Cloudera Manager, look in the dynamic resource pool configuration.

If not, check out http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_system-admin-guide/content/setting_application_limits.html

which describes yarn.scheduler.capacity.<queue-path>.maximum-applications.
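
For example, a hedged sketch for capacity-scheduler.xml, assuming the default queue (the value is illustrative):

    <!-- Illustrative value: accept at most 5 applications (running or pending) in root.default -->
    <property>
      <name>yarn.scheduler.capacity.root.default.maximum-applications</name>
      <value>5</value>
    </property>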

ewm
  • Pay attention: this also limits pending applications, so if you set a small value you could run into trouble when you submit too many applications that need to queue. – Thomas Decaux Jun 08 '17 at 14:07