2

We have a docker swarm stack with 36 services deployed on Azure Cloud with Docker 20.10.12 version, 7 nodes and 3 managers. Every time a deploy is done in the cluster, 3 of our services get restarted even if there are no updates for them.

The issue is not happening for the other services, so we want to understand why these 3 are restarted.

For all services including these we are using a explicitly tagged images(not latest) and 2 of the restarted services have a healthcheck, one of them doesn’t.

Inspecting the shutdown containers doesn’t provide the cause, container health looks ok until it seems to be shutdown by the swarm:

"State": { "Status": "exited", "Running": false, "Paused": false, "Restarting": false, "OOMKilled": false, "Dead": false, "Pid": 0, "ExitCode": 0, "Error": "", "StartedAt": "2022-06-28T11:07:50.253278648Z", "FinishedAt": "2022-06-28T12:09:38.028324252Z", "Health": { "Status": "unhealthy", "FailingStreak": 0, "Log": [ { "Start": "2022-06-28T12:07:31.823395345Z", "End": "2022-06-28T12:07:31.877107802Z", "ExitCode": 0, "Output": "" }...

Docker stack ps for one of the services: “CurrentState”:“Shutdown 2 hours ago”,“DesiredState”:“Shutdown”,“Error”:“”,“ID”:“986fpoefleoj” …

We’ve set docker logs to debug and I can see the running task desired state is set to shutdown and a new task is created but no reason why this happens:

new task ---> time="2022-06-28T12:08:26.336889162Z" level=debug msg="task kcu8p5kk29s7a3bqxaaifvqnm was marked pending allocation" module=node node.id=g9to4w40wx0met9q8h64h29b7 ... existing task---> time="2022-06-28T12:08:26.533482977Z" level=debug msg=assigned module=node/agent node.id=g9to4w40wx0met9q8h64h29b7 task.desiredstate=SHUTDOWN task.id=986fpoefleoj8mpnhu0eaop62 time="2022-06-28T12:08:26.533530078Z" level=debug msg=assigned module=node/agent node.id=g9to4w40wx0met9q8h64h29b7 task.desiredstate=READY task.id=kcu8p5kk29s7a3bqxaaifvqnm

Existing container is disabled and a new one is created:

DisableService 134bcd810301dad5ce535324960e66de085d3b5db07e4ac80cdef4fb4b2e6d69 START ... time="2022-06-28T12:08:45.247316912Z" level=debug msg="EnableService b00270822ac16280d40d162d34eed1b01a05af06debe6fd10e7bf90fd0cd1c7e START" ...

Found these posts that are somewhat similar but they don’t seem to have gotten an answer: https://forums.docker.com/t/why-does-docker-swarm-set-desired-status-to-shutdown/69379

How to investigate Docker Swarm Mode shutting down containers?

Any idea how we could find the cause why these services are restarted?

mirela
  • 21
  • 2
  • Some possible causes for this behavior have been discussed in this https://github.com/moby/moby/issues/31115 open (for years) ticket. I would appreciate it, if you would update this question with any cause or solution that you may find, as we also have seen this behavior. The only workaround we have applied so far is to split the deployment over several stack files. – JohannesB Jul 12 '23 at 12:20

0 Answers0