Mesos-level health checks (MESOS_HTTP, MESOS_HTTPS, MESOS_TCP, and COMMAND) are locally executed by Mesos on the agent running the corresponding task and thus test reachability from the Mesos executor. Mesos-level health checks offer the following advantages over Marathon-level health checks:
Mesos-level health checks are performed as close to the task as possible, so they are are not affected by networking failures.
Mesos-level health checks are delegated to the agents running the tasks, so the number of tasks that can be checked can scale horizontally with the number of agents in the cluster.
Limitations and considerations
Mesos-level health checks consume extra resources on the agents; moreover, there is some overhead for fork-execing a process and entering the tasks’ namespaces every time a task is checked.
The health check processes share resources with the task that they check. Your application definition must account for the extra resources consumed by the health checks.
Mesos-level health checks require tasks to listen on the container’s loopback interface in addition to whatever interface they require. If you run a service in production, you will want to make sure that the users can reach it.
Marathon currently does NOT support the combination of Mesos and Marathon level health checks.
Example usage
HTTP:
{
"path": "/api/health",
"portIndex": 0,
"protocol": "HTTP",
"gracePeriodSeconds": 300,
"intervalSeconds": 60,
"timeoutSeconds": 20,
"maxConsecutiveFailures": 3,
"ignoreHttp1xx": false
}
or Mesos HTTP:
{
"path": "/api/health",
"portIndex": 0,
"protocol": "MESOS_HTTP",
"gracePeriodSeconds": 300,
"intervalSeconds": 60,
"timeoutSeconds": 20,
"maxConsecutiveFailures": 3
}
or secure HTTP:
{
"path": "/api/health",
"portIndex": 0,
"protocol": "HTTPS",
"gracePeriodSeconds": 300,
"intervalSeconds": 60,
"timeoutSeconds": 20,
"maxConsecutiveFailures": 3,
"ignoreHttp1xx": false
}
Note: HTTPS health checks do not verify the SSL certificate.
or TCP:
{
"portIndex": 0,
"protocol": "TCP",
"gracePeriodSeconds": 300,
"intervalSeconds": 60,
"timeoutSeconds": 20,
"maxConsecutiveFailures": 0
}
or COMMAND:
{
"protocol": "COMMAND",
"command": { "value": "curl -f -X GET http://$HOST:$PORT0/health" },
"gracePeriodSeconds": 300,
"intervalSeconds": 60,
"timeoutSeconds": 20,
"maxConsecutiveFailures": 3
}
{
"protocol": "COMMAND",
"command": { "value": "/bin/bash -c \\\"</dev/tcp/$HOST/$PORT0\\\"" }
}
Further Information: https://mesosphere.github.io/marathon/docs/health-checks.html