I'm trying to use the persistent volumes support for Mesos, and am having a tremendously difficult time getting it to work.
I've configured each of my slaves as follows, and have confirmed that they rebooted successfully with this new config:
/etc/mesos-slave/resources
[
{
"name" : "disk",
"type" : "SCALAR",
"scalar" : { "value" : 4194304 },
"disk" : {
"source" : {
"type" : "PATH",
"path" : { "root" : "/mnt/disk1" }
}
}
},
{
"name" : "disk",
"type" : "SCALAR",
"scalar" : { "value" : 4194304 },
"disk" : {
"source" : {
"type" : "PATH",
"path" : { "root" : "/mnt/disk2" }
}
}
},
{
"name" : "disk",
"type" : "SCALAR",
"scalar" : { "value" : 4194304 },
"disk" : {
"source" : {
"type" : "PATH",
"path" : { "root" : "/mnt/disk3" }
}
}
},
{
"name" : "disk",
"type" : "SCALAR",
"scalar" : { "value" : 4194304 },
"disk" : {
"source" : {
"type" : "PATH",
"path" : { "root" : "/mnt/disk4" }
}
}
},
{
"name" : "disk",
"type" : "SCALAR",
"scalar" : { "value" : 4194304 },
"disk" : {
"source" : {
"type" : "PATH",
"path" : { "root" : "/mnt/disk5" }
}
}
},
{
"name" : "disk",
"type" : "SCALAR",
"scalar" : { "value" : 4194304 },
"disk" : {
"source" : {
"type" : "MOUNT",
"mount" : { "root" : "/mnt/disk6" }
}
}
},
{
"name" : "disk",
"type" : "SCALAR",
"scalar" : { "value" : 4194304 },
"disk" : {
"source" : {
"type" : "MOUNT",
"mount" : { "root" : "/mnt/disk7" }
}
}
}
]
The state reported for the slaves shows that I have unreserved resources, including disk (full response here):
{
...
"slaves": [{
"id": "c5e59876-5157-463f-b31e-16b34d6ffc72-S8",
"pid": "slave(1)@172.30.31.55:5051",
"hostname": "redacted47.redacted.com",
"registered_time": 1458810586.61153,
"resources": {
"cpus": 32,
"disk": 29360128,
"mem": 256651,
"ports": "[31000-32000]"
},
"used_resources": {
"cpus": 1,
"disk": 0,
"mem": 128,
"ports": "[31282-31282]"
},
"offered_resources": {
"cpus": 0,
"disk": 0,
"mem": 0
},
"reserved_resources": {},
"unreserved_resources": {
"cpus": 32,
"disk": 29360128,
"mem": 256651,
"ports": "[31000-32000]"
},
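As a sanity check on the numbers above, here is a minimal Python sketch confirming that the seven entries in /etc/mesos-slave/resources add up to the "disk" total in the state output, and counting how many use each source type. The names DISKS, total_disk, and count_by_source are illustrative only and not part of Mesos.

```python
# (source type, root, size) for each entry in /etc/mesos-slave/resources,
# copied from the config shown earlier in this post.
DISKS = [
    ("PATH", "/mnt/disk1", 4194304),
    ("PATH", "/mnt/disk2", 4194304),
    ("PATH", "/mnt/disk3", 4194304),
    ("PATH", "/mnt/disk4", 4194304),
    ("PATH", "/mnt/disk5", 4194304),
    ("MOUNT", "/mnt/disk6", 4194304),
    ("MOUNT", "/mnt/disk7", 4194304),
]

# "disk" value under "resources" in the state output.
REPORTED_DISK = 29360128


def total_disk(disks):
    """Sum the scalar sizes of all configured disk entries."""
    return sum(size for _source_type, _root, size in disks)


def count_by_source(disks):
    """Count configured disk entries per source type."""
    counts = {}
    for source_type, _root, _size in disks:
        counts[source_type] = counts.get(source_type, 0) + 1
    return counts


if __name__ == "__main__":
    print(total_disk(DISKS) == REPORTED_DISK)
    print(count_by_source(DISKS))
```

So the slave does report all seven disks, and every one of them carries a PATH or MOUNT source; none is declared without a source.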
Whenever I try to submit a job that requests a persistent volume, all of the slaves reject it, claiming that no disk resources are available:
Mar 26 17:59:43 redacted47.redacted.com start[30457]: [2016-03-26 17:59:43,606] INFO Offer [2220b6bf-aac2-402b-82e6-8d625284d1a4-O9375]. Considering unreserved resources with roles {*}. Not all basic resources satisfied: cpus SATISFIED (1.0 <= 1.0), mem SATISFIED (128.0 <= 128.0), disk including volumes NOT SATISFIED (1024.0 > 0.0) (mesosphere.mesos.ResourceMatcher$:marathon-akka.actor.default-dispatcher-38)
Mar 26 17:59:43 redacted47.redacted.com start[30457]: [2016-03-26 17:59:43,606] INFO Offer [2220b6bf-aac2-402b-82e6-8d625284d1a4-O9376]. Considering unreserved resources with roles {*}. Not all basic resources satisfied: cpus SATISFIED (1.0 <= 1.0), mem SATISFIED (128.0 <= 128.0), disk including volumes NOT SATISFIED (1024.0 > 0.0) (mesosphere.mesos.ResourceMatcher$:marathon-akka.actor.default-dispatcher-38)
Mar 26 17:59:43 redacted47.redacted.com start[30457]: [2016-03-26 17:59:43,606] INFO Finished processing 2220b6bf-aac2-402b-82e6-8d625284d1a4-O9375. Matched 0 ops after 1 passes. disk(*) 4194304.0; disk(*) 4194304.0; disk(*) 4194304.0; disk(*) 4194304.0; disk(*) 4194304.0; disk(*) 4194304.0; disk(*) 4194304.0; cpus(*) 28.0; mem(*) 226955.0; ports(*) 31000->31085,31087->31364,31366->31940,31942->32000 left. (mesosphere.marathon.core.matcher.manager.impl.OfferMatcherManagerActor:marathon-akka.actor.default-dispatcher-11)
Mar 26 17:59:43 redacted47.redacted.com start[30457]: [2016-03-26 17:59:43,606] INFO Offer [2220b6bf-aac2-402b-82e6-8d625284d1a4-O9379]. Considering unreserved resources with roles {*}. Not all basic resources satisfied: cpus SATISFIED (1.0 <= 1.0), mem SATISFIED (128.0 <= 128.0), disk including volumes NOT SATISFIED (1024.0 > 0.0) (mesosphere.mesos.ResourceMatcher$:marathon-akka.actor.default-dispatcher-38)
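To make the mismatch in these log lines easier to see, here is a rough Python sketch that pulls the per-resource verdicts out of one ResourceMatcher message. The regular expression is my own guess at the message layout based on the lines above, not a documented Marathon format, and reading the two numbers as "requested" and "available" is likewise an assumption.

```python
import re

# Matches fragments such as "cpus SATISFIED (1.0 <= 1.0)" or
# "disk including volumes NOT SATISFIED (1024.0 > 0.0)".
VERDICT = re.compile(
    r"([a-z][a-z ]*?) (NOT SATISFIED|SATISFIED) \(([\d.]+) (?:<=|>) ([\d.]+)\)"
)

# Message portion of one of the log lines above.
LINE = (
    "Not all basic resources satisfied: cpus SATISFIED (1.0 <= 1.0), "
    "mem SATISFIED (128.0 <= 128.0), "
    "disk including volumes NOT SATISFIED (1024.0 > 0.0)"
)


def parse_verdicts(line):
    """Return {resource name: (satisfied, first number, second number)}."""
    verdicts = {}
    for name, verdict, first, second in VERDICT.findall(line):
        verdicts[name] = (verdict == "SATISFIED", float(first), float(second))
    return verdicts


if __name__ == "__main__":
    for name, result in parse_verdicts(LINE).items():
        print(name, result)
```

The disk check compares 1024.0 against 0.0, even though the "Finished processing" line for the same offer lists seven disk(*) resources of 4194304.0 each as left over.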
If I post a request to create a volume directly against the Mesos master, it rejects the request with "Insufficient disk resources", as follows:
# curl -v -i \
-u "marathon:$(cat /etc/marathon/.secret)" \
-d slaveId=c5e59876-5157-463f-b31e-16b34d6ffc72-S8 \
-d volumes='[
{
"name": "disk",
"type": "SCALAR",
"scalar": { "value": 512 },
"role": "foo",
"reservation": {
"principal": "marathon"
},
"disk": {
"persistence": {
"id" : "very-persist"
},
"volume": {
"mode": "RW",
"container_path": "such-path"
}
}
}
]' \
-X POST http://localhost:5050/master/create-volumes; echo
* About to connect() to localhost port 5050 (#0)
* Trying ::1...
* Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 5050 (#0)
* Server auth using Basic with user 'marathon'
> POST /master/create-volumes HTTP/1.1
> Authorization: Basic redacted
> User-Agent: curl/7.29.0
> Host: localhost:5050
> Accept: */*
> Content-Length: 481
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 481 out of 481 bytes
< HTTP/1.1 409 Conflict
HTTP/1.1 409 Conflict
< Date: Thu, 24 Mar 2016 09:50:36 GMT
Date: Thu, 24 Mar 2016 09:50:36 GMT
< Content-Length: 53
Content-Length: 53
<
* Connection #0 to host localhost left intact
Invalid CREATE Operation: Insufficient disk resources
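One difference between the volume in this request and the disks in the slave config can be checked mechanically: the configured disks all carry a disk.source, while the posted volume does not. The Python sketch below only demonstrates that difference. Whether it explains the 409 is an assumption on my part that the output above does not confirm, and source_of is a hypothetical helper, not a Mesos API.

```python
import json

# The volume posted to /master/create-volumes, copied from the curl command.
REQUESTED_VOLUME = json.loads("""
{
  "name": "disk",
  "type": "SCALAR",
  "scalar": { "value": 512 },
  "role": "foo",
  "reservation": { "principal": "marathon" },
  "disk": {
    "persistence": { "id": "very-persist" },
    "volume": { "mode": "RW", "container_path": "such-path" }
  }
}
""")

# First entry of /etc/mesos-slave/resources, copied from the config.
CONFIGURED_DISK = json.loads("""
{
  "name": "disk",
  "type": "SCALAR",
  "scalar": { "value": 4194304 },
  "disk": {
    "source": { "type": "PATH", "path": { "root": "/mnt/disk1" } }
  }
}
""")


def source_of(resource):
    """Return the disk.source object of a resource, or None if it has none."""
    return resource.get("disk", {}).get("source")


if __name__ == "__main__":
    print("requested volume source:", source_of(REQUESTED_VOLUME))
    print("configured disk source:", source_of(CONFIGURED_DISK))
```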
I'm at my wits' end. I don't know what I'm doing wrong, and I'm trying my best to follow the documentation. Any hint as to what I might be missing would be greatly, tremendously appreciated.
I'm running:
- Mesos 0.28.0
- Marathon 1.0.0RC1
I'm following the instructions from the following resources, as best as I can:
- https://mesosphere.github.io/marathon/docs/persistent-volumes.html
- http://mesos.apache.org/documentation/latest/persistent-volume/
- http://mesos.apache.org/documentation/latest/multiple-disk/
Thank you for reading!