We run Docker Swarm mode (17.09.0-ce) on 4 nodes and are trying to deploy 10 services using docker stack deploy and a docker-compose.yml. Each service has a memory reservation and a memory limit set in docker-compose.yml.

Some services get killed:

$ docker service ps st_master_xwiki
ID                  NAME                    IMAGE                                         NODE                                      DESIRED STATE       CURRENT STATE          ERROR               PORTS
s900hx36b70d        st_master_xwiki.1       docker-stage.ipsoft.com/apollo-xwiki:master   dyn-10-140-175-140.rnd.cloud.ipsoft.com   Running             Running 3 hours ago
52gzwwyipky0         \_ st_master_xwiki.1   docker-stage.ipsoft.com/apollo-xwiki:master   dyn-10-140-175-123.rnd.cloud.ipsoft.com   Shutdown            Shutdown 3 hours ago
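
The table above only shows Shutdown with an empty ERROR column. As a rough sketch of how to dig further: `docker service ps --no-trunc st_master_xwiki` prints full task IDs, and `docker inspect <task-id>` returns the task object as JSON. The Python below pulls the status fields out of such an object. The field names follow the task object in the Docker Engine API as far as I know; `summarize_task` and the `SAMPLE` data are hypothetical and not taken from this cluster.

```python
import json


def summarize_task(task: dict) -> dict:
    """Collect the status-related fields of one Swarm task object."""
    status = task.get("Status", {})
    container = status.get("ContainerStatus", {})
    return {
        "desired_state": task.get("DesiredState"),
        "state": status.get("State"),
        "error": status.get("Err", ""),  # empty when no error was recorded
        "exit_code": container.get("ExitCode"),
    }


# Hypothetical, heavily trimmed output of `docker inspect <task-id>`.
SAMPLE = """
[{"ID": "hypothetical-task-id",
  "DesiredState": "shutdown",
  "Status": {"State": "shutdown",
             "ContainerStatus": {"ExitCode": 137}}}]
"""

for task in json.loads(SAMPLE):
    print(summarize_task(task))
```

A task whose state is "shutdown" rather than "failed" suggests that the orchestrator itself asked for the task to stop, so the next place to look would be the manager and node daemon logs around the FinishedAt time rather than the container logs.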

The container logs show nothing unusual:

# docker logs 0578be3e943d134ae71f38b8354d1b5319bcc8164502555844b5d046ba3dcd0f
Starting Jetty on port 4424, please wait...
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=196m; support was removed in 8.0
2017-10-11 16:52:44.732:INFO::main: Logging initialized @222ms
2017-10-11 16:52:44.915:INFO:oejs.Server:main: jetty-9.2.13.v20150730
2017-10-11 16:52:44.931:INFO:oejs.AbstractNCSARequestLog:main: Opened /apps/xwiki/data/logs/2017_10_11.request.log
2017-10-11 16:52:44.933:INFO:oejdp.ScanningAppProvider:main: Deployment monitor [file:/apps/xwiki/jetty/contexts/] at interval 0
2017-10-11 16:52:55.811:INFO:oejsh.ContextHandler:main: Started o.e.j.w.WebAppContext@7225790e{/xwiki,file:/apps/xwiki/webapps/xwiki/,AVAILABLE}{/xwiki}
2017-10-11 16:52:55.821:INFO:oejsh.ContextHandler:main: Started o.e.j.w.WebAppContext@70efdd18{/,file:/apps/xwiki/webapps/root/,AVAILABLE}{/root}
2017-10-11 16:52:55.844:INFO:oejs.ServerConnector:main: Started ServerConnector@3b11deb6{HTTP/1.1}{0.0.0.0:4424}
2017-10-11 16:52:56.077:INFO:oejs.ServerConnector:main: Started ServerConnector@41dc34c8{SSL-http/1.1}{0.0.0.0:4423}
2017-10-11 16:52:56.077:INFO:oejs.Server:main: Started @11568ms
2017-10-11 16:52:56.077:INFO:oxtjl.NotifyListener:main: ----------------------------------
2017-10-11 16:52:56.079:INFO:oxtjl.NotifyListener:main: Server started, you can now open http://0578be3e943d:4424/ in your browser to access your wiki.
2017-10-11 16:52:56.079:INFO:oxtjl.NotifyListener:main: ----------------------------------

There is nothing in /var/log/messages either, so no sign of the OOM killer:

# grep 0578be3e943d134ae71f38b8354d1b5319bcc8164502555844b5d046ba3dcd0f /var/log/messages
#

docker inspect shows exit code 137, which means the process was killed with SIGKILL:

# docker inspect 0578be3e943d134ae71f38b8354d1b5319bcc8164502555844b5d046ba3dcd0f

"State": {
   "Status": "exited",
   "Running": false,
   "Paused": false,
   "Restarting": false,
   "OOMKilled": false,
   "Dead": false,
   "Pid": 0,
   "ExitCode": 137,
   "Error": "",
   "StartedAt": "2017-10-11T16:52:44.496298307Z",
   "FinishedAt": "2017-10-11T17:35:10.077594101Z"
}
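
For reference, the way 137 maps to KILL can be sketched in a few lines of Python. By shell convention an exit status above 128 means "terminated by signal (status - 128)", so 137 is 128 + 9, and signal 9 is SIGKILL on Linux. The helper name `describe_exit_code` is made up for this illustration, and a process could in principle also exit with 137 on its own, so treat the result as a hint.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a container exit code using the 128 + signal convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            return f"killed by unknown signal {signum}"
        return f"killed by {name} (signal {signum})"
    return f"exited with status {code}"


# ExitCode from the "State" block above.
print(describe_exit_code(137))
```

Combined with "OOMKilled": false, this says that something sent SIGKILL but Docker did not record it as an out-of-memory kill.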

But what is killing the containers? How do I find out why a Docker Swarm mode task ended up in the Shutdown state?

relgames
  • Can you share your docker-compose.yml? If you use a volume without a named volume, the container never runs. – Jinna Balu Oct 17 '17 at 19:40
  • I have the same problem. For one service, the container is killed very often and I cannot find the reason. Probably either the health check failed or the defined maximum memory was reached. But how can I find the reason, and where is it logged? I can only fix the problem if I know what the problem was… :-( – Marc Wäckerlin Feb 12 '21 at 16:52
  • Facing this problem as well, still can't find why the container was shut down – applecider Nov 18 '21 at 08:55
  • We just switched to k8s, which has better error reporting. – relgames Nov 18 '21 at 12:05