0

We have this scenario.

We have a Mesos cluster with 3 masters and 3 slaves.

Each slave is identical, with 4 GB of RAM and 4 CPU cores.

We have started 10 Marathon apps, each with 1 CPU core and 1 GB of RAM. The containers are running but idle; the system reports that 97% of the CPU is free.

Now we are trying to start another container with 3 CPU cores and 2 GB of RAM.

Unfortunately, the container does not start. According to the Mesos logs, Marathon declined the offer, even though none of the slave nodes is doing any work. The Marathon app stays in the Deployment state.

If Mesos cannot allocate resources to a Marathon app even though the running containers are not using theirs, then what is the use of the Docker integration here?

As per my understanding:

Once a Marathon app accepts an offer, Mesos treats those resources as in use by the app, even if the Docker container is not actually using them. But if the container is not using any resources, Mesos should reclaim the unused resources and allocate them to the next Marathon application.

Instead, once an offer is assigned to a Marathon app, Mesos subtracts the allocated resources from the total resources.

We are not fully utilizing the Docker features in Mesos/Marathon.

Any suggestions or answers are welcome.

Thank you

Rico
Rajiv Reddy

2 Answers

2

Mesos tracks allocation, not actual usage. If your app is idle right now, that doesn't mean it will stay idle a moment later. So if your app requested 1 CPU, that CPU is reserved for the app.

Now, if you don't want to estimate precisely the resources your app uses, you may want to look at oversubscription in Mesos. Keep in mind, though, that once the app to which the resources were originally allocated requests them, apps running on the oversubscribed resources may be terminated.

Community
rukletsov
1

Mesos/Marathon counts the full allocation of 10 × (1 GB + 1 CPU), because that is the maximum your apps are allowed to use. So yes, your understanding is correct.
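To make the arithmetic concrete, here is a rough Python sketch of the accounting. It is not Mesos code; it assumes each slave advertises its full 4 CPUs and 4 GB (real agents usually hold some back), and all names are made up for illustration. It enumerates every way the 10 existing tasks could be spread over the 3 slaves and checks whether any slave still has 3 CPUs and 2 GB unallocated.

```python
from itertools import product

# Numbers from the question; assumes slaves advertise all of their resources.
NODES = 3
NODE_CPUS, NODE_MEM_GB = 4, 4
TASKS = 10
TASK_CPUS, TASK_MEM_GB = 1, 1
NEW_CPUS, NEW_MEM_GB = 3, 2


def feasible_placements():
    """Yield per-slave task counts that fit within each slave's capacity."""
    cap = min(NODE_CPUS // TASK_CPUS, NODE_MEM_GB // TASK_MEM_GB)
    for counts in product(range(cap + 1), repeat=NODES):
        if sum(counts) == TASKS:
            yield counts


def fits_new_task(counts):
    """True if at least one slave has enough unallocated CPU and memory."""
    for n in counts:
        free_cpus = NODE_CPUS - n * TASK_CPUS
        free_mem = NODE_MEM_GB - n * TASK_MEM_GB
        if free_cpus >= NEW_CPUS and free_mem >= NEW_MEM_GB:
            return True
    return False


placements = list(feasible_placements())
any_fit = any(fits_new_task(p) for p in placements)
total_free_cpus = NODES * NODE_CPUS - TASKS * TASK_CPUS

print("possible placements:", placements)
print("unallocated CPUs in the whole cluster:", total_free_cpus)
print("some slave can host the 3 CPU / 2 GB task:", any_fit)
```

Under these assumptions only 2 CPUs remain unallocated across the whole cluster, so no single offer can satisfy a 3 CPU request, regardless of how idle the containers are.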

In my opinion, you have at least 2 options:

  1. Assign fewer resources to your tasks.
  2. There is an interesting new feature that seems to fit your use case: oversubscription, which tries to utilize the difference between allocated and actually used resources.
js84
  • 1. We have a problem with Marathon here. We are running WordPress sites inside the containers. If we allocate fewer resources, then when a container reaches its maximum RAM (based on the allocation) it starts to use swap, and it takes a long time to respond. We configured health checks, so because of the slow response Marathon kills the old container and creates a new one. @js84 – Rajiv Reddy Aug 25 '15 at 05:02
  • 2. @rukletsov @js84 If we try to use oversubscription, we can't create all the apps with it. They clearly state: `If any resource used by a task or executor is revocable, the whole container is treated as a revocable container and can therefore be killed or throttled by the QoS Controller.` – Rajiv Reddy Aug 25 '15 at 05:09
  • Right, oversubscription is for best-effort tasks. If I understand your case correctly, your containers may use all allocated memory. If this is the case, you can't really improve utilization, because your tasks eventually need all resources available in the cluster. Does it make sense? – rukletsov Aug 25 '15 at 23:37