I have a web service using WebSockets, and I need to implement zero-downtime deployment. Because I don't want to drop existing connections on deploy, I've decided to implement a blue/green deployment. My current solution looks like this:

  1. I've created two identical services in Portainer, listening on different ports. Each service has an identifier set in its Node environment, for example alfa or beta.
  2. Both services sit behind a load balancer, and the balancer periodically checks the status of each service. If a service responds on a specific route (/balancer-keepalive-check) with the string "OK", the service is active and the balancer can route to it. If a service responds with the string "STOP", the balancer marks it as inaccessible, but its active connections are preserved.
  3. Which service is active and which is stopped is synced via Redis. There are keys lb-status-alfa and lb-status-beta, which can contain the value 1 for active or 0 for inactive (a sketch of seeding these flags follows the CI config below). Example implementation of the /balancer-keepalive-check route in NestJS:
    import { Controller, Get } from '@nestjs/common';
    import { RedisClient } from 'redis';
    import { promisify } from 'util';

    @Controller()
    export class AppController {

        private redisClient = new RedisClient({ host: process.env.REDIS_HOST });
        private serviceId: string = process.env.ID;  // 'alfa' or 'beta'

        @Get('balancer-keepalive-check')
        async balancerCheckAlive(): Promise<string> {
            const getAsync = promisify(this.redisClient.get).bind(this.redisClient);
            // Redis stores the flag as a string: '1' = active, '0' = inactive
            const status = await getAsync(`lb-status-${this.serviceId}`);
            const reply: string = status === '1' ? 'OK' : 'STOP';
            return `<response>${reply}</response>`;
        }
    }
  4. In GitLab CI, I create a Docker image tagged with the commit tag and restart the service by calling the Portainer webhook for that specific service. This works well for one service, but I don't know how to use two different DEPLOY_WEBHOOK CI variables and switch between them:
image: registry.rassk.work/pokec/pokec-nodejs-build-image:p1.0.1
services:
  - name: docker:dind

variables:
  DOCKER_TAG: platform-websocket:$CI_COMMIT_TAG

deploy:
  tags:
    - dtm-builder
  environment:
    name: $CI_COMMIT_TAG
  script:
    - npm set registry http://some-private-npm-registry-url.sk
    - if [ "$ENV_CONFIG" ]; then cp $ENV_CONFIG $PWD/.env; fi
    - if [ "$PRIVATE_KEY" ]; then cp $PRIVATE_KEY $PWD/privateKey.pem; fi
    - if [ "$PUBLIC_KEY" ]; then cp $PUBLIC_KEY $PWD/publicKey.pem; fi
    - docker build -t $DOCKER_TAG .
    - docker tag $DOCKER_TAG registry.rassk.work/community/$DOCKER_TAG
    - docker push registry.rassk.work/community/$DOCKER_TAG
    - curl --request POST $DEPLOY_WEBHOOK
  only:
    - tags
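
For context, the flags the keepalive route reads could be seeded once like this (a minimal sketch using the same node_redis client and key names as the code above; alfa starts active, beta stopped):

    import { RedisClient } from 'redis';

    const redisClient = new RedisClient({ host: process.env.REDIS_HOST });
    // '1' makes /balancer-keepalive-check answer "OK", '0' makes it answer "STOP"
    redisClient.set('lb-status-alfa', '1');
    redisClient.set('lb-status-beta', '0');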

The questions I don't know how to solve are:

  • When I have two services, I have two different deploy webhooks, and after a deploy I need to call only one of them, because I don't want to restart both services. How do I determine which one? How do I implement some kind of counter that tells whether this deploy goes to the "alfa" or the "beta" service? Should I use the GitLab API and update DEPLOY_WEBHOOK after each deploy? Or should I get rid of this GitLab CI/CD variable and use some API on the services which will tell me the webhook URL?
  • How do I update the values in Redis? Should I implement a custom API for this?
  • Is there a better way to achieve this?

Additional info: I can't use the GitLab API from the services, because our GitLab is self-hosted on a domain accessible only from our private network.

  • 1. Read about load balancers, no need to reinvent the wheel. 2. If you insist on implementing this yourself, e.g. because you have a legacy system where only one instance is allowed to be active, it's better to hold a single value in Redis saying which one is active, rather than multiple flags, in order to reduce race conditions, and to use a TTL on the key to make sure one is up, e.g. [implementation here](https://github.com/pub-comp/redis-context/blob/43b1678d9d74c52d999a9919775a45b23d6946ad/RedisRepo/RedisContext.cs#L924) – Danny Varod Nov 24 '21 at 12:47
  • This is not about the load balancer; that part I have working. Try to read my questions one more time. This is about how to find out which service needs to be restarted after a deploy from GitLab CI/CD. – michal pavlik Nov 24 '21 at 12:50
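(For reference, the single-value approach suggested in the first comment could look like the sketch below: one key, e.g. lb-active-service, holds the ID of the active instance, refreshed with a TTL so a stale value expires. The key name and TTL are illustrative; this is not what the answer below implements.)

    import { RedisClient } from 'redis';

    const redisClient = new RedisClient({ host: process.env.REDIS_HOST });
    // one key instead of two flags; refreshed periodically, expires after 30 s
    redisClient.set('lb-active-service', 'alfa', 'EX', 30);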

1 Answer


I've modified my AppController. There are two new endpoints now: one to identify which service is running, and a second to switch the values in Redis:

    private serviceId: string = process.env.ID || 'alfa';

    @Get('running-service-id')
    info(): string {
        // returns 'alfa' or 'beta', depending on which instance is currently active
        return this.serviceId;
    }

    @Get('switch')
    switch(): void {
        // activate the other instance first, then deactivate this one
        const play = this.serviceId === 'alfa' ? 'lb-status-beta' : 'lb-status-alfa';
        const stop = `lb-status-${this.serviceId}`;
        this.redisClient.set(play, '1', (err) => {
            if (!err) {
                this.redisClient.set(stop, '0');
            }
        });
    }

After that, I modified my .gitlab-ci.yml as follows:

image: registry.rassk.work/pokec/pokec-nodejs-build-image:p1.0.1
services:
  - name: docker:dind

stages:
  - build
  - deploy
  - switch

variables:
  DOCKER_TAG: platform-websocket:$CI_COMMIT_TAG

test:
  stage: build
  allow_failure: true
  tags:
    - dtm-builder
  script:
    - npm set registry http://some-private-npm-registry-url.sk
    - npm install
    - npm run test

build:
  stage: build
  tags:
    - dtm-builder
  environment:
    name: $CI_COMMIT_TAG
  script:
    - if [ "$ENV_CONFIG" ]; then cp $ENV_CONFIG $PWD/.env; fi
    - if [ "$PRIVATE_KEY" ]; then cp $PRIVATE_KEY $PWD/privateKey.pem; fi
    - if [ "$PUBLIC_KEY" ]; then cp $PUBLIC_KEY $PWD/publicKey.pem; fi
    - docker build -t $DOCKER_TAG .
    - docker tag $DOCKER_TAG registry.rassk.work/community/$DOCKER_TAG
    - docker push registry.rassk.work/community/$DOCKER_TAG
  only:
    - tags

deploy:
  stage: deploy
  needs: [build, test]
  environment:
    name: $CI_COMMIT_TAG
  script:
    - 'SERVICE_RUNNING=$(curl --request GET http://172.17.101.125/running-service-id)'
    - echo $SERVICE_RUNNING
    - if [ "$SERVICE_RUNNING" == "1" ]; then curl --request POST $DEPLOY_WEBHOOK_2; fi
    - if [ "$SERVICE_RUNNING" == "2" ]; then curl --request POST $DEPLOY_WEBHOOK_1; fi
  only:
    - tags

switch:
  stage: switch
  needs: [deploy]
  environment:
    name: $CI_COMMIT_TAG
  script:
    - sleep 10
    - curl --request GET http://172.17.101.125/switch
  only:
    - tags

In the build job, the Docker image is built. After that, the deploy job runs: it makes a request to /running-service-id to identify which service is running, then deploys the image to the stopped service by calling the matching webhook. Last is the switch job, which makes a request to the /switch route, which switches the values in Redis.

This works well. The last thing I need to implement is some kind of secret for these two routes (a JWT, for example).
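
A minimal sketch of such a guard, assuming a shared secret in a DEPLOY_TOKEN environment variable and a custom X-Deploy-Token header (both names are illustrative, not part of the setup above; a real JWT check would replace the string comparison):

    import { CanActivate, ExecutionContext, Injectable } from '@nestjs/common';

    @Injectable()
    export class DeployTokenGuard implements CanActivate {
        canActivate(context: ExecutionContext): boolean {
            const request = context.switchToHttp().getRequest();
            // the CI jobs would send: curl -H "X-Deploy-Token: $DEPLOY_TOKEN" ...
            return request.headers['x-deploy-token'] === process.env.DEPLOY_TOKEN;
        }
    }

The two routes would then be annotated with @UseGuards(DeployTokenGuard).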
