
I can't figure out why the following request won't start health checking in Marathon. The container starts, but the status remains Deploying:

{
  "id": "bridged-webapp",
  "cmd": "python3 -m http.server 8080",
  "cpus": 0.1,
  "mem": 64.0,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "python:3",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 8080,
          "hostPort": 31313,
          "servicePort": 9000,
          "protocol": "tcp"
        }
      ]
    }
  },
  "healthChecks": [
    {
      "protocol": "COMMAND",
      "command": { "value": "echo 0" },
      "maxConsecutiveFailures": 3
    }
  ]
}

When I look at the Marathon logs, they just say that the health check has started, but nothing more:

Aug 28 16:52:33 cnode2 marathon[11495]: [2015-08-28 16:52:33,603] INFO   Adding health check for app [/bridged-webapp] and version [2015-08-28T16:52:33.500Z]: [HealthCheck(Some(/),COMMAND,0,Some(Command(echo 0)),300 seconds,60 seconds,20 seconds,3,false)] (mesosphere.marathon.health.MarathonHealthCheckManager$$EnhancerByGuice$$d8828133:76)
Aug 28 16:52:33 cnode2 marathon[11495]: [INFO] [08/28/2015 16:52:33.604] [marathon-akka.actor.default-dispatcher-693] [akka://marathon/user/$kg] Starting health check actor for app [/bridged-webapp] and healthCheck [HealthCheck(Some(/),COMMAND,0,Some(Command(echo 0)),300 seconds,60 seconds,20 seconds,3,false)]
Aug 28 16:52:33 cnode2 marathon[11495]: [INFO] [08/28/2015 16:52:33.627] [marathon-akka.actor.default-dispatcher-694] [akka://marathon/user/MarathonScheduler/$a/DeploymentManager/4f3a1a8e-8934-441a-9c55-b7cf332893e2/$a] Successfully started 0 instances of /bridged-webapp
Aug 28 16:52:37 cnode2 marathon[11495]: [2015-08-28 16:52:37,080] INFO Received status update for task bridged-webapp.2f4399a5-4da5-11e5-b538-080027bb2503: TASK_RUNNING () (mesosphere.marathon.MarathonScheduler$$EnhancerByGuice$$b7a64e04:100)

In the UI the job health is gray, meaning that the health check status is unknown.

What is really strange is that if I run the same job but without the container it works.

Any ideas?

TeaBough

2 Answers


Update: It turns out the command health check doesn't work with the Docker executor. A colleague has already opened an issue for this: https://github.com/mesosphere/marathon/issues/2140 and we will at least update the documentation ASAP.

Thanks for discovering this!
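
In the meantime, one possible workaround is an HTTP health check, which Marathon performs itself instead of delegating it to the executor. The fragment below is only a sketch: it assumes the app answers on `/` at its first mapped port, and the timing values simply mirror the ones shown in the `HealthCheck(...)` log line from the question. It has not been verified against this exact setup.

```json
"healthChecks": [
  {
    "protocol": "HTTP",
    "path": "/",
    "portIndex": 0,
    "gracePeriodSeconds": 300,
    "intervalSeconds": 60,
    "timeoutSeconds": 20,
    "maxConsecutiveFailures": 3
  }
]
```

This would replace the `healthChecks` array in the app definition; the rest of the request stays the same.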

js84
  • The problem is that it's not even giving me a red status; it's giving me a gray status, which means the health check is unknown. Moreover, in their curl-based example, curl returns 0 if it succeeds. – TeaBough Aug 31 '15 at 07:59
  • Even if https://issues.apache.org/jira/browse/MESOS-3136 says it's working, with Mesos 0.25.0 and Marathon 0.11 it's not ... – TeaBough Oct 13 '15 at 13:20

Looks like the behaviour changed in Mesos 0.27.0: the COMMAND now executes inside the Docker container. I don't see it documented anywhere.

darshandzend
  • Are you sure? It seems that the `mesos-health-check` binary is still used in the Mesos 0.28.x release: https://github.com/apache/mesos/blob/0.28.x/src/docker/executor.cpp#L332 – Balthazar Rouberol Aug 19 '16 at 09:57