
I have a Kubernetes deployment of Redis with three Sentinel replicas and three Redis pods. One of the Sentinel pods fails with this error:

*** FATAL CONFIG FILE ERROR (Redis 6.2.3) ***
Reading the configuration file, at line 4
>>> 'sentinel monitor mymaster  6379 2'
Unrecognized sentinel configuration statement.
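
The rejected statement has two spaces between "mymaster" and "6379" where a host should be, so the value substituted for $MASTER appears to have been empty when the file was written. A minimal sketch of a guard, assuming the init container should fail rather than write the statement without a host (monitor_line is a hypothetical helper, not part of the manifests in this question):

```shell
#!/bin/sh
# Hypothetical helper: print the monitor statement only when a master
# host is known. The name "mymaster", port 6379, and quorum 2 are the
# values used in the question.
monitor_line() {
    master="$1"
    if [ -z "$master" ]; then
        # An empty host would produce "sentinel monitor mymaster  6379 2",
        # which Sentinel rejects at startup.
        echo "no master host resolved; not writing sentinel monitor line" >&2
        return 1
    fi
    echo "sentinel monitor mymaster $master 6379 2"
}
```

Called as `monitor_line "$MASTER" >> /tmp/master || exit 1`, this would make the init container exit non-zero when no master host was found, instead of producing a configuration file that Sentinel cannot parse.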

My Sentinel manifest is below:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sentinel
  namespace: demos
spec:
  serviceName: sentinel
  replicas: 3
  selector:
    matchLabels:
      app: sentinel
  template:
    metadata:
      labels:
        app: sentinel
    spec:
      initContainers:
      - name: config
        image: redis:6.2.3-alpine
        imagePullPolicy: "IfNotPresent"
        command: [ "sh", "-c" ]
        args:
          - |
            REDIS_PASSWORD=a-very-complex-password-here
            nodes=redis-0.redis,redis-1.redis,redis-2.redis

            for i in ${nodes//,/ }
            do
                echo "finding master at $i"
                MASTER=$(redis-cli --no-auth-warning --raw -h $i -a $REDIS_PASSWORD info replication | awk '{print $1}' | grep master_host: | cut -d ":" -f2)
                if [ "$MASTER" == "" ]; then
                    echo "no master found"
                    MASTER=
                else
                    echo "found $MASTER"
                    break
                fi
            done
            echo "sentinel monitor mymaster $MASTER 6379 2" >> /tmp/master
            echo "port 5000
            sentinel resolve-hostnames yes
            sentinel announce-hostnames yes
            $(cat /tmp/master)
            sentinel down-after-milliseconds mymaster 5000
            sentinel failover-timeout mymaster 60000
            sentinel parallel-syncs mymaster 1
            sentinel auth-pass mymaster $REDIS_PASSWORD
            " > /etc/redis/sentinel.conf
            cat /etc/redis/sentinel.conf
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis/
      containers:
      - name: sentinel
        image: redis:6.2.3-alpine
        imagePullPolicy: "IfNotPresent"
        command: ["redis-sentinel"]
        args: ["/etc/redis/sentinel.conf"]
        ports:
        - containerPort: 5000
          name: sentinel
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis/
        - name: data
          mountPath: /data
      volumes:
      - name: redis-config
        emptyDir: {}
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "cinder-csi"
      resources:
        requests:
          storage: 50Mi
---
apiVersion: v1
kind: Service
metadata:
  name: sentinel
  namespace: demos
spec:
  clusterIP: None
  ports:
  - port: 5000
    targetPort: 5000
    name: sentinel
  selector:
    app: sentinel

And the Redis manifest:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
  namespace: demos
spec:
  serviceName: redis
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      initContainers:
      - name: config
        image: redis:6.2.3-alpine
        imagePullPolicy: "IfNotPresent"
        command: [ "sh", "-c" ]
        args:
          - |
            cp /tmp/redis/redis.conf /etc/redis/redis.conf
            
            echo "finding master..."
            MASTER_FDQN=`hostname  -f | sed -e 's/redis-[0-9]\./redis-0./'`
            if [ "$(redis-cli -h sentinel -p 5000 ping)" != "PONG" ]; then
              echo "master not found, defaulting to redis-0"

              if [ "$(hostname)" == "redis-0" ]; then
                echo "this is redis-0, not updating config..."
              else
                echo "updating redis.conf..."
                echo "slaveof $MASTER_FDQN 6379" >> /etc/redis/redis.conf
              fi
            else
              echo "sentinel found, finding master"
              MASTER="$(redis-cli -h sentinel -p 5000 sentinel get-master-addr-by-name mymaster | grep -E '(^redis-[0-9]{1,})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})')"
              echo "master found : $MASTER, updating redis.conf"
              echo "slaveof $MASTER 6379" >> /etc/redis/redis.conf
            fi
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis/
        - name: config
          mountPath: /tmp/redis/
      containers:
      - name: redis
        image: redis:6.2.3-alpine
        imagePullPolicy: "IfNotPresent"
        command: ["redis-server"]
        args: ["/etc/redis/redis.conf"]
        ports:
        - containerPort: 6379
          name: redis
        volumeMounts:
        - name: data
          mountPath: /data
        - name: redis-config
          mountPath: /etc/redis/
      volumes:
      - name: redis-config
        emptyDir: {}
      - name: config
        configMap:
          name: redis-config
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "cinder-csi"
      resources:
        requests:
          storage: 50Mi
---
apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: demos
spec:
  clusterIP: None
  ports:
  - port: 6379
    targetPort: 6379
    name: redis
  selector:
    app: redis

This is what Kubernetes reports for the pods:

kubectl -n demos get pods
NAME                                  READY   STATUS             RESTARTS   AGE
redis-0                               1/1     Running            0          3h3m
redis-1                               1/1     Running            0          23m
redis-2                               1/1     Running            0          23m
sentinel-0                            0/1     CrashLoopBackOff   40         3h3m
sentinel-1                            1/1     Running            0          126m
sentinel-2                            1/1     Running            0          8m33s

I have also extracted the logs for redis-2:

1:S 26 May 2022 13:58:46.924 # Server initialized
1:S 26 May 2022 13:58:46.925 * Ready to accept connections
1:S 26 May 2022 13:58:46.925 * Connecting to MASTER redis-0.redis.labs.svc.myserver-XXX:XXXX
1:S 26 May 2022 13:58:46.947 * MASTER <-> REPLICA sync started
1:S 26 May 2022 13:58:46.947 * Non blocking connect for SYNC fired the event.
1:S 26 May 2022 13:58:46.947 * Master replied to PING, replication can continue...
1:S 26 May 2022 13:58:46.951 * Partial resynchronization not possible (no cached master)
1:S 26 May 2022 13:58:46.954 * Full resync from master: 6b11d6e184b481e9112053er6575790dhjdj782:55557
1:S 26 May 2022 13:58:47.038 * MASTER <-> REPLICA sync: receiving 178 bytes from master to disk
1:S 26 May 2022 13:58:47.038 * MASTER <-> REPLICA sync: Flushing old data
1:S 26 May 2022 13:58:47.044 * MASTER <-> REPLICA sync: Loading DB in memory
1:S 26 May 2022 13:58:47.052 * Loading RDB produced by version 6.2.3
1:S 26 May 2022 13:58:47.052 * RDB age 1 seconds
1:S 26 May 2022 13:58:47.052 * RDB memory usage when created 1.89 Mb
1:S 26 May 2022 13:58:47.053 * MASTER <-> REPLICA sync: Finished with success
1:S 26 May 2022 13:58:47.053 * Background append only file rewriting started by pid 11
1:S 26 May 2022 13:58:47.091 * AOF rewrite child asks to stop sending diffs.
11:C 26 May 2022 13:58:47.091 * Parent agreed to stop sending diffs. Finalizing AOF...
11:C 26 May 2022 13:58:47.091 * Concatenating 0.00 MB of AOF diff received from parent.
11:C 26 May 2022 13:58:47.091 * SYNC append only file rewrite performed
11:C 26 May 2022 13:58:47.091 * AOF rewrite: 0 MB of memory used by copy-on-write
1:S 26 May 2022 13:58:47.154 * Background AOF rewrite terminated with success
1:S 26 May 2022 13:58:47.154 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 26 May 2022 13:58:47.154 * Background AOF rewrite finished successfully

I am not sure why one Sentinel replica (sentinel-0) keeps failing while the other two are running.

What am I missing?

Golide
  • Found anything? – vinni_f Jul 06 '22 at 17:02
  • You can use "https://github.com/bitnami/charts/tree/main/bitnami/redis-cluster"; it is quite good, and I had no errors when I tried it. – myuce Jan 06 '23 at 15:02
