
Docker updates the container, but network registration takes about 10 minutes to complete. While the new container is being registered, the page returns 502 because the internal network is still pointing at the old container. How can I delay the removal of the old container by 10 minutes or so after the new container starts? Ideally I would like to push this config with docker stack, but I'll do whatever it takes. I should also note that I am unable to use replicas right now due to limitations of a security package I'm being forced to use.

version: '3.7'
services:
  xxx:
    image: ${xxx}/com.xxx:${xxx}
    environment:
      - SERVICE_NAME=xxx
      - xxx
      - _xxx
      - SPRING_PROFILES_ACTIVE=${xxx}
    networks:
      - xxx${xxx}
    healthcheck:
      interval: 1m
    deploy:
      mode: replicated
      replicas: 1
      resources:
        limits:
          cpus: '3'
          memory: 1024M
        reservations:
          cpus: '0.50'
          memory: 256M
      labels:
        - com.docker.lb.hosts=xxx${_xxx}.xxx.com
        - jenkins.url=${xxx}
        - com.docker.ucp.access.label=/${xxx}/xxx
        - com.docker.lb.network=xxx${_xxx}
        - com.docker.lb.port=8080
        - com.docker.lb.service_cluster=${xxx}
        - com.docker.lb.ssl_cert=xxx.cert
        - com.docker.lb.ssl_key=xxx.key
        - com.docker.lb.redirects=http://xxx${_xxx}.xxx.com/xxx,https://xxx${_xxx}.xxx.com/xxx
      restart_policy:
        condition: any
        delay: 5s
        max_attempts: 3
        window: 120s
      update_config:
        parallelism: 1
        delay: 10s
        order: start-first
        failure_action: rollback
      rollback_config:
        parallelism: 0
        order: stop-first
    secrets:
      - ${xxx}

networks:
  xxx${_xxx}:
    external: true

secrets:
  ${xxx}:
    external: true
  xxx.cert:
    external: true
  xxx.key:
    external: true
IdiotDrake
  • Your 10-minute delay for network registration is a bit weird. How many containers are running in your swarm cluster? – Marc ABOUCHACRA May 13 '20 at 19:27
  • Not sure, a lot. There are 20 or so on this network alone, and it's one of the smaller ingress networks out of a hundred or so. I should note I'm not a network guy; if the network needs to be changed I can make that suggestion, but this is what I've got to work with for now. – IdiotDrake May 13 '20 at 19:30
  • Then, if you have hundreds of networks in the swarm cluster, I think your delay is probably due to Docker updating all the iptables rules. However, 10 minutes seems a lot for *only* hundreds of networks (but I guess it depends on the number of containers on each network), so this is just a guess... And if I'm right and that's the case, then unfortunately you can't do anything to reduce that duration. – Marc ABOUCHACRA May 13 '20 at 19:41
  • This is what Kubernetes tries to solve with rolling updates. – tadman May 13 '20 at 19:42
  • @Marc Well, my network is a small one; some of the other networks have many more containers, hundreds, and more are made every day. Anyway, I never thought I would be able to cut down on the registration time. I just thought I should be able to delay shutting down the old container after the new one is up. – IdiotDrake May 13 '20 at 19:44
  • @tadman I have made that suggestion as well. The point is moot, though, with the limitation on replicas. – IdiotDrake May 13 '20 at 19:46
  • Docker is great for simple configurations but as soon as you go down this path either you use Kubernetes or you end up painfully re-inventing it. – tadman May 13 '20 at 19:46
  • I just don't have 5 years to wait for 50 architects to agree to it. I'm just a developer with a problem to solve in the system they give me. – IdiotDrake May 13 '20 at 19:50

1 Answer


Use a proper healthcheck. See the reference here: https://docs.docker.com/compose/compose-file/#healthcheck

So:

  1. Define a proper test that tells Docker when your new container is fully up (this goes inside the test instruction of your healthcheck).
  2. Use the start_period instruction to cover your 10 (or so) minute wait; otherwise, Docker Swarm would just kill your new container and never let it start.

Basically, once you get the healthcheck right, this should solve your issue.
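
As a rough sketch of the two steps above, the healthcheck section of the stack file might look like the fragment below. The wget command and the 8080 port are taken from the Dockerfile the asker posted in the comments; the timeout and retries values are assumptions picked for illustration, not recommendations. One caveat worth keeping in mind: start_period only stops failed probes from counting against retries, and a probe that succeeds during that window marks the container healthy right away, so the test itself has to fail until the service is genuinely reachable.

```yaml
version: '3.7'
services:
  xxx:
    healthcheck:
      # Probe copied from the asker's Dockerfile HEALTHCHECK; adjust the path as needed.
      test: ["CMD-SHELL", "wget -q -O /dev/null http://localhost:8080/path/actuator/health || exit 1"]
      interval: 1m        # same interval as in the question's stack file
      timeout: 10s        # assumed value
      retries: 3          # assumed value
      start_period: 10m   # failures in this window do not count toward retries
```

With order: start-first in update_config, Swarm waits for the new task to report healthy before stopping the old one, which is why the point at which the test first passes matters here.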

taleodor
  • What do you mean? Healthcheck comes from yaml definition, in what you posted in the question you only have `healthcheck: interval: 1m` - which essentially does nothing. If you're overriding it with swarm cli, pls post the command and also inspect the service to make sure it's correct. – taleodor May 14 '20 at 15:23
  • Maybe I was unclear. The problem is that a curl or wget against localhost passes as soon as the new container is created (while the network is still pointing at the old container). The old container is then destroyed, but the network keeps pointing at it, because it takes 10 minutes or so for the network to discover the new (healthy) container. – IdiotDrake May 14 '20 at 15:33
  • Dockerfile config that generates the command: `FROM myRepo`, `ARG myJarFile`, `EXPOSE myPort`, `COPY myJarFile /newJarFile.jar`, `ENTRYPOINT java -jar /elife-sal-v2.jar`, `HEALTHCHECK --start-period=600s CMD wget -q -O /dev/null http://localhost:8080/path/actuator/health || exit 1` – IdiotDrake May 14 '20 at 15:36
  • I think your healthcheck instruction in yaml is essentially overriding your Dockerfile definition and you are left with no healthcheck in the end. Pls try to edit your yaml with full healthcheck (based on what you just posted above - with command and start-period, etc) and see if it fixes your issue. – taleodor May 14 '20 at 15:43
  • Sorry, this is not the case. I'm open-minded about being wrong, so I tried your suggestion. The healthcheck passes, though; my issue is not with the new container failing to deploy, I can confirm this. – IdiotDrake May 14 '20 at 15:55
  • The healthcheck passes on a container which is not ready? Look, I'm not going to continue this, but container startup is controlled by the healthcheck. It will be the same if you move to k8s: you need to configure probes correctly there for this logic to work. – taleodor May 14 '20 at 17:36