I want to run a bunch of applications or containers on a single machine. I would like to isolate usage of the following resources:

  1. CPU
  2. Memory
  3. I/O (network, disk, etc)

Ideally, I would like to achieve proportional usage of all resources, so that if some containers are idle, others can take advantage of the unused capacity. Static reservations (e.g. 10% for each of 10 apps) are not ideal.

I know we can do this for CPU, but I'm not sure whether that generalizes to everything. I would appreciate detailed answers (not just "use tc / qdisc / iptables for network").

Emil Condrea
sydraz
  • "Proportional usage of all resources" and "performance isolation" hardly mix. You either limit container resources (limits can overlap) or provide access to all resources and hope that the schedulers inside Linux are fair enough (CFS, CFQ, etc.). – myaut Apr 16 '15 at 20:04
  • Fair enough. Maybe I should clarify. What I mean is that if some container wants to use 4 cores, 2 GB RAM, 10 Mb/s disk and 20 Mb/s network, it should be guaranteed that much, but it should also be able to scale up when nothing else is running on the machine. – sydraz Apr 16 '15 at 20:12

1 Answer

With control groups (cgroups) you can achieve resource isolation for:

  • CPU
  • Memory
  • Network
  • Disk

When two or more processes compete for a resource and one of them might take so much that the others do not get a fair chance, you can use cgroups to set the terms of the contention: for example, one group cannot get more than 60% and another no more than 30%, and so on. If there is no contention and only a single requester, it can use as much as it wants until another process tries to use the same resource.

Examples of I/O Throttling

Introduction to Linux Control Groups
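As a rough illustration of the weight-based setup described above, here is a minimal Python sketch. It assumes the cgroup v2 unified hierarchy mounted at `/sys/fs/cgroup` (the linked guides may describe the older v1 files such as `cpu.shares` instead), and the helper names `cgroup_settings` and `apply_settings` are made up for this example. It covers CPU, block I/O and memory only; network is left out. On a real host, the controllers must also be enabled in the parent's `cgroup.subtree_control`, and writing the files needs root.

```python
from pathlib import Path

# Assumption: cgroup v2 unified hierarchy mounted at the usual location.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def cgroup_settings(cpu_weight, io_weight, memory_low_bytes):
    """Return {interface file: value} for one cgroup (hypothetical helper).

    cpu.weight and io.weight are relative weights in 1..10000 (default 100).
    They only matter under contention, so a group on an otherwise idle
    machine can use more than its proportional share.
    memory.low is a best-effort protection, not a cap.
    """
    for name, weight in (("cpu.weight", cpu_weight), ("io.weight", io_weight)):
        if not 1 <= weight <= 10000:
            raise ValueError(f"{name} must be in 1..10000, got {weight}")
    if memory_low_bytes < 0:
        raise ValueError("memory.low must not be negative")
    return {
        "cpu.weight": str(cpu_weight),
        "io.weight": f"default {io_weight}",
        "memory.low": str(memory_low_bytes),
    }


def apply_settings(group, settings, root=CGROUP_ROOT):
    """Write each setting to <root>/<group>/<file> (hypothetical helper)."""
    group_dir = Path(root) / group
    # On a real cgroup filesystem, creating the directory creates the cgroup.
    group_dir.mkdir(parents=True, exist_ok=True)
    for filename, value in settings.items():
        (group_dir / filename).write_text(value + "\n")
    return group_dir
```

Passing a scratch directory as `root` lets you inspect what would be written without touching the real hierarchy. Processes are then placed in the group by writing their PID to the group's `cgroup.procs` file.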

Regarding scaling up when the machine is idle: if you use the Completely Fair Scheduler (CFS), a cgroup can get more than its allocated CPU share when there are enough idle CPU cycles available in the system.

Red Hat Resource Management Guide:

When tasks in one cgroup are idle and are not using any CPU time, this left-over time is collected in a global pool of unused CPU cycles. Other cgroups are allowed to borrow CPU cycles from this pool
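The borrowing behaviour in this quote can be illustrated with a toy model. The sketch below is not how CFS is implemented; it is a simplified weighted max-min ("water-filling") calculation, and the function name `proportional_allocation` and the weights used are assumptions made for the example. Each group gets capacity in proportion to its weight, and whatever an idle or lightly loaded group leaves unused is redistributed among the groups that still want more.

```python
def proportional_allocation(capacity, weights, demands):
    """Split `capacity` by weight, handing unused share to busy groups.

    weights: {group: relative weight}, demands: {group: amount wanted}.
    Toy model only; real schedulers work on time slices, not totals.
    """
    allocation = {name: 0.0 for name in weights}
    # Groups that want nothing are idle and take no part in the split.
    active = {name for name in weights if demands[name] > 0}
    remaining = float(capacity)

    while active and remaining > 1e-9:
        total_weight = sum(weights[name] for name in active)
        # Groups whose whole demand fits inside their weighted share.
        satisfied = {
            name for name in active
            if demands[name] <= remaining * weights[name] / total_weight
        }
        if not satisfied:
            # Everyone left wants more than their share: split by weight.
            for name in active:
                allocation[name] = remaining * weights[name] / total_weight
            break
        # Grant the small demands in full; the rest returns to the pool.
        for name in satisfied:
            allocation[name] = float(demands[name])
            remaining -= demands[name]
        active -= satisfied

    return allocation
```

With weights 60/30/10 and all three groups saturated, the split follows the weights. If the third group goes idle, the other two divide the whole capacity 2:1, and a single busy group gets everything.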

cpusets.txt documentation:

And if a CPU run out of tasks in its runqueue, the CPU try to pull extra tasks from other busy CPUs to help them before it is going to be idle.

Of course it takes some searching cost to find movable tasks and/or idle CPUs, the scheduler might not search all CPUs in the domain every time. In fact, in some architectures, the searching ranges on events are limited in the same socket or node where the CPU locates, while the load balance on tick searches all.

For example, assume CPU Z is relatively far from CPU X. Even if CPU Z is idle while CPU X and the siblings are busy, scheduler can't migrate woken task B from X to Z since it is out of its searching range. As the result, task B on CPU X need to wait task A or wait load balance on the next tick. For some applications in special situation, waiting 1 tick may be too long.

A few other methods to achieve resource isolation: nice (for easy tweaks) and cpulimit (static resource allocation: when other CPUs are idle, the unused share is not lent to other processes).

Emil Condrea