I am trying to set up a high-availability RabbitMQ cluster in my Kubernetes cluster as a StatefulSet, so that my data (e.g. queues, messages) persists even after restarting all of the nodes simultaneously. Since I'm deploying the RabbitMQ nodes in Kubernetes, I understand that I need to include an external persistent volume for the nodes to store data in so that the data survives a restart. I have mounted an Azure Files share into my containers as a volume at the directory /var/lib/rabbitmq/mnesia.
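For reference, I check from inside a pod that the share is actually mounted where I expect it, with something like this (pod and namespace names as in the manifests further down):
kubectl exec rabbitmq-0 -n rabbit -- df -h /var/lib/rabbitmq/mnesia
kubectl exec rabbitmq-0 -n rabbit -- ls -la /var/lib/rabbitmq/mnesia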
When starting with a fresh (empty) volume, the nodes start up without any issues and successfully form a cluster. I can open the RabbitMQ management UI and see that any queue I create is mirrored on all of the nodes, as expected, and the queue (plus any messages in it) persists as long as there is at least one active node. Deleting a single pod with kubectl delete pod rabbitmq-0 -n rabbit causes that node to stop and then restart, and the logs show that it successfully syncs with the remaining active nodes, so everything is fine.
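For example, restarting a single node and confirming that it rejoined looks roughly like this:
kubectl delete pod rabbitmq-0 -n rabbit
kubectl exec rabbitmq-1 -n rabbit -- rabbitmqctl cluster_status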
The problem I have encountered is that when I delete all of the RabbitMQ pods simultaneously, the first node to start up has the persisted data from the volume and tries to re-cluster with the other two nodes, which are, of course, not active yet. What I expected to happen was that the node would start up, load the queue and message data, and then form a new cluster (since it should notice that no other nodes are active).
I suspect that there may be some data in the mounted volume that indicates the presence of the other nodes, which would explain why it tries to connect to them and join the supposed cluster, but I haven't found a way to prevent that and am not certain that this is the cause.
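For what it's worth, the share does contain the node's Mnesia directory (the database dir shown in the logs below, which is also where the nodes_running_at_shutdown file mentioned later lives), and I can list it with something like:
kubectl exec rabbitmq-0 -n rabbit -- ls /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local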
There are two different error messages: one in the pod description (kubectl describe pod rabbitmq-0 -n rabbit) when the RabbitMQ node is in a crash loop, and another in the pod logs. The pod description error output includes the following:
exited with 137:
20:38:12.331 [error] Cookie file /var/lib/rabbitmq/.erlang.cookie must be accessible by owner only
Error: unable to perform an operation on node 'rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local'. Please see diagnostics information and suggestions below.
Most common reasons for this are:
* Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
* CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
* Target node is not running
In addition to the diagnostics info below:
* See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
* Consult server logs on node rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local
* If target node is configured to use long node names, don't forget to use --longnames with CLI tools
DIAGNOSTICS
===========
attempted to contact: ['rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local']
rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local:
* connected to epmd (port 4369) on rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local
* epmd reports: node 'rabbit' not running at all
no other nodes on rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local
* suggestion: start the node
Current node details:
* node name: 'rabbitmqcli-345-rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local'
* effective user's home directory: /var/lib/rabbitmq
* Erlang cookie hash: xxxxxxxxxxxxxxxxx
and the logs output the following info:
Config file(s): /etc/rabbitmq/rabbitmq.conf
Starting broker...2020-06-12 20:39:08.678 [info] <0.294.0>
node : rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local
home dir : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.conf
cookie hash : xxxxxxxxxxxxxxxxx
log(s) : <stdout>
database dir : /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local
...
2020-06-12 20:48:39.015 [warning] <0.294.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,['rabbit@rabbitmq-2.rabbitmq-internal.rabbit.svc.cluster.local','rabbit@rabbitmq-1.rabbitmq-internal.rabbit.svc.cluster.local','rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local'],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2020-06-12 20:48:39.015 [info] <0.294.0> Waiting for Mnesia tables for 30000 ms, 0 retries left
2020-06-12 20:49:09.341 [info] <0.44.0> Application mnesia exited with reason: stopped
2020-06-12 20:49:09.505 [error] <0.294.0>
2020-06-12 20:49:09.505 [error] <0.294.0> BOOT FAILED
2020-06-12 20:49:09.505 [error] <0.294.0> ===========
2020-06-12 20:49:09.505 [error] <0.294.0> Timeout contacting cluster nodes: ['rabbit@rabbitmq-2.rabbitmq-internal.rabbit.svc.cluster.local',
2020-06-12 20:49:09.505 [error] <0.294.0> 'rabbit@rabbitmq-1.rabbitmq-internal.rabbit.svc.cluster.local'].
...
BACKGROUND
==========
This cluster node was shut down while other nodes were still running.
2020-06-12 20:49:09.506 [error] <0.294.0>
2020-06-12 20:49:09.506 [error] <0.294.0> This cluster node was shut down while other nodes were still running.
2020-06-12 20:49:09.506 [error] <0.294.0> To avoid losing data, you should start the other nodes first, then
2020-06-12 20:49:09.506 [error] <0.294.0> start this one. To force this node to start, first invoke
To avoid losing data, you should start the other nodes first, then
start this one. To force this node to start, first invoke
"rabbitmqctl force_boot". If you do so, any changes made on other
cluster nodes after this one was shut down may be lost.
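If I understand that message correctly, the force_boot it suggests would be run against the affected pod with something like the following, but I'd rather understand why the node doesn't recover on its own:
kubectl exec rabbitmq-0 -n rabbit -- rabbitmqctl force_boot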
What I've tried so far is clearing the contents of the /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local/nodes_running_at_shutdown file, and fiddling with config settings such as the volume mount directory and the Erlang cookie permissions.
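(Clearing that file was done by exec-ing into the pod and truncating it, roughly along these lines; the exact command may have differed:
kubectl exec rabbitmq-0 -n rabbit -- sh -c '> /var/lib/rabbitmq/mnesia/rabbit@rabbitmq-0.rabbitmq-internal.rabbit.svc.cluster.local/nodes_running_at_shutdown'
)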
Below are the relevant deployment files and config files:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
  namespace: rabbit
spec:
  serviceName: rabbitmq-internal
  revisionHistoryLimit: 3
  updateStrategy:
    type: RollingUpdate
  replicas: 3
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      name: rabbitmq
      labels:
        app: rabbitmq
    spec:
      serviceAccountName: rabbitmq
      terminationGracePeriodSeconds: 10
      containers:
      - name: rabbitmq
        image: rabbitmq:0.13
        lifecycle:
          postStart:
            exec:
              command:
              - /bin/sh
              - -c
              - >
                until rabbitmqctl --erlang-cookie ${RABBITMQ_ERLANG_COOKIE} node_health_check; do sleep 1; done;
                rabbitmqctl --erlang-cookie ${RABBITMQ_ERLANG_COOKIE} set_policy ha-all "" '{"ha-mode":"all", "ha-sync-mode": "automatic"}'
        ports:
        - containerPort: 4369
        - containerPort: 5672
        - containerPort: 5671
        - containerPort: 25672
        - containerPort: 15672
        resources:
          requests:
            memory: "500Mi"
            cpu: "0.4"
          limits:
            memory: "600Mi"
            cpu: "0.6"
        livenessProbe:
          exec:
            # Stage 2 check:
            command: ["rabbitmq-diagnostics", "status", "--erlang-cookie", "$(RABBITMQ_ERLANG_COOKIE)"]
          initialDelaySeconds: 60
          periodSeconds: 60
          timeoutSeconds: 15
        readinessProbe:
          exec:
            # Stage 2 check:
            command: ["rabbitmq-diagnostics", "status", "--erlang-cookie", "$(RABBITMQ_ERLANG_COOKIE)"]
          initialDelaySeconds: 20
          periodSeconds: 60
          timeoutSeconds: 10
        envFrom:
        - configMapRef:
            name: rabbitmq-cfg
        env:
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: RABBITMQ_USE_LONGNAME
          value: "true"
        - name: RABBITMQ_NODENAME
          value: "rabbit@$(HOSTNAME).rabbitmq-internal.$(NAMESPACE).svc.cluster.local"
        - name: K8S_SERVICE_NAME
          value: "rabbitmq-internal"
        - name: RABBITMQ_DEFAULT_USER
          value: user
        - name: RABBITMQ_DEFAULT_PASS
          value: pass
        - name: RABBITMQ_ERLANG_COOKIE
          value: my-cookie
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        volumeMounts:
        - name: my-volume-mount
          mountPath: "/var/lib/rabbitmq/mnesia"
      imagePullSecrets:
      - name: my-secret
      volumes:
      - name: my-volume-mount
        azureFile:
          secretName: azure-rabbitmq-secret
          shareName: my-fileshare-name
          readOnly: false
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: rabbitmq-cfg
  namespace: rabbit
data:
  RABBITMQ_VM_MEMORY_HIGH_WATERMARK: "0.6"
---
kind: Service
apiVersion: v1
metadata:
  namespace: rabbit
  name: rabbitmq-internal
  labels:
    app: rabbitmq
spec:
  clusterIP: None
  ports:
  - name: http
    protocol: TCP
    port: 15672
  - name: amqp
    protocol: TCP
    port: 5672
  - name: amqps
    protocol: TCP
    port: 5671
  selector:
    app: rabbitmq
---
kind: Service
apiVersion: v1
metadata:
  namespace: rabbit
  name: rabbitmq
  labels:
    app: rabbitmq
    type: LoadBalancer
spec:
  selector:
    app: rabbitmq
  ports:
  - name: http
    protocol: TCP
    port: 15672
    targetPort: 15672
  - name: amqp
    protocol: TCP
    port: 5672
    targetPort: 5672
  - name: amqps
    protocol: TCP
    port: 5671
    targetPort: 5671
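Everything above is deployed with a plain kubectl apply, along the lines of (rabbitmq.yaml is a placeholder for the actual manifest file):
kubectl apply -f rabbitmq.yaml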
Dockerfile:
FROM rabbitmq:3.8.4
COPY conf/rabbitmq.conf /etc/rabbitmq
COPY conf/enabled_plugins /etc/rabbitmq
USER root
COPY conf/.erlang.cookie /var/lib/rabbitmq
RUN /bin/bash -c 'ls -ld /var/lib/rabbitmq/.erlang.cookie; chmod 600 /var/lib/rabbitmq/.erlang.cookie; ls -ld /var/lib/rabbitmq/.erlang.cookie'
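For completeness, the custom image referenced in the StatefulSet is built from this Dockerfile, roughly as follows; the conf/ directory sits next to the Dockerfile and holds the rabbitmq.conf, enabled_plugins and .erlang.cookie files copied above:
docker build -t rabbitmq:0.13 .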
rabbitmq.conf:
## cluster formation settings
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
cluster_formation.k8s.address_type = hostname
cluster_formation.k8s.service_name = rabbitmq-internal
cluster_formation.k8s.hostname_suffix = .rabbitmq-internal.rabbit.svc.cluster.local
cluster_formation.node_cleanup.interval = 60
cluster_formation.node_cleanup.only_log_warning = true
cluster_partition_handling = autoheal
queue_master_locator=min-masters
## general settings
log.file.level = debug
## Mgmt UI secure/non-secure connection settings (secure not implemented yet)
management.tcp.port = 15672
## RabbitMQ entrypoint settings (will be injected below when image is built)
Thanks in advance!