I have a Cassandra StatefulSet in Kubernetes with 4 pods: cassandra-0, cassandra-1, cassandra-2, cassandra-3. Sometimes I need to shut down all nodes and start them again. I cannot do this node by node; all nodes have to go down at the same time.

My clients lose their connections, and after the restart they cannot reconnect, because the Cassandra Python driver tries to reconnect to the same IP addresses (it does not use the DNS name to resolve the new IP address of cassandra-0).

Why does the Cassandra driver not look up hosts by their DNS names, and instead keep the old IP addresses?

Should I add a load balancer, a connection factory, or some other class? How do I configure the Cassandra driver for Kubernetes, where the DNS names are static but the IP addresses of the database pods are dynamic?

My connection looks like this (simplified for explanation):

    cluster = Cluster(
        contact_points=['cassandra-0.cassandra',
                        'cassandra-1.cassandra',
                        'cassandra-2.cassandra',
                        'cassandra-3.cassandra']
    )
    session = cluster.connect()
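
The short names above typically resolve only from pods in the same namespace as the headless service. A minimal sketch of building the fully qualified pod names instead, following the pattern used for `CASSANDRA_SEEDS` in the manifest further down. The helper name `statefulset_hosts` and the default `cluster.local` cluster domain are assumptions for illustration:

```python
def statefulset_hosts(statefulset, service, namespace, replicas,
                      cluster_domain='cluster.local'):
    """Return the stable DNS name of every pod in a StatefulSet."""
    return [
        f'{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}'
        for ordinal in range(replicas)
    ]


# Names for the 4-pod StatefulSet described in this question.
contact_points = statefulset_hosts(
    'cassandra', 'cassandra', 'cassandra-stable-connection', replicas=4)
```

The resulting list can be passed as `contact_points` to `Cluster(...)` in place of the hard-coded short names.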

Full test code:

import logging
import random
import time

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster, Session, ExecutionProfile


def main():
    logging.basicConfig(level=logging.DEBUG)
    logger = logging.getLogger(__name__)
    cluster = Cluster(
        contact_points=['cassandra-0.cassandra',
                        'cassandra-1.cassandra',
                        'cassandra-2.cassandra',
                        'cassandra-3.cassandra']
    )

    cluster.add_execution_profile(
        'all',
        ExecutionProfile(
            consistency_level=ConsistencyLevel.ALL,
        )
    )

    cluster.add_execution_profile(
        'local_quorum',
        ExecutionProfile(
            consistency_level=ConsistencyLevel.LOCAL_QUORUM,
            serial_consistency_level=ConsistencyLevel.LOCAL_SERIAL,
        )
    )

    session = cluster.connect()

    result = session.execute('''
        create keyspace if not exists test
        with durable_writes = true
        and replication = {
            'class' : 'SimpleStrategy',
            'replication_factor' : 3
        }''', execution_profile='all')

    session.set_keyspace('test')

    result = session.execute('''
        create table if not exists test (
            test_id uuid,
            creation_date timestamp,
            
            primary key (test_id)
        )
    ''', execution_profile='all')

    while True:
        try:
            start_time = time.perf_counter()
            session.execute('''
            insert into test (
                test_id, creation_date
            )
            values (
                now(), toTimestamp(now())
            )
            ''', execution_profile='local_quorum')
            end_time = time.perf_counter()
        except Exception:
            logger.exception('Insert failed.')
        else:
            logger.debug('Message is sent in %s.', end_time - start_time)
        time.sleep(1.0 / random.randint(1, 5))


if __name__ == '__main__':
    main()
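
One workaround that could be tested with the code above, assuming the driver resolves contact point names only when the `Cluster` object is created: when all hosts are lost, discard the old `Cluster` and build a new one from freshly resolved addresses. This is a sketch, not the driver's own reconnection mechanism. The names `resolve_contact_points`, `connect_with_fresh_dns` and `cluster_factory` are hypothetical, and the injectable `resolver` and `sleep` parameters exist only to make the logic testable without a cluster:

```python
import socket
import time


def resolve_contact_points(hostnames, port=9042, resolver=socket.getaddrinfo):
    """Resolve each hostname to its current IP addresses, skipping failures."""
    addresses = []
    for name in hostnames:
        try:
            infos = resolver(name, port, proto=socket.IPPROTO_TCP)
        except socket.gaierror:
            continue  # the pod is not (yet) registered in DNS
        for info in infos:
            ip = info[4][0]  # sockaddr is the fifth element, IP comes first
            if ip not in addresses:
                addresses.append(ip)
    return addresses


def connect_with_fresh_dns(hostnames, cluster_factory, attempts=5, delay=2.0,
                           resolver=socket.getaddrinfo, sleep=time.sleep):
    """Create a brand-new cluster object from freshly resolved addresses.

    cluster_factory is a hypothetical callable such as
    lambda ips: Cluster(contact_points=ips); the object it returns
    must offer connect() and shutdown().
    """
    last_error = None
    for _ in range(attempts):
        ips = resolve_contact_points(hostnames, resolver=resolver)
        if ips:
            cluster = cluster_factory(ips)
            try:
                return cluster, cluster.connect()
            except Exception as exc:  # e.g. NoHostAvailable from the driver
                last_error = exc
                cluster.shutdown()
        sleep(delay)
    raise RuntimeError('could not connect to any contact point') from last_error
```

In the insert loop, the `except` branch could call `connect_with_fresh_dns(...)` after the old cluster has been shut down and replace `session` with the returned one. Execution profiles and the keyspace would have to be set again on the new objects.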

The Kubernetes manifests I use to set up Cassandra for these tests, to show the cluster architecture:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-path
provisioner: rancher.io/local-path
reclaimPolicy: Delete

---

apiVersion: v1
kind: Service
metadata:
  namespace: cassandra-stable-connection
  name: cassandra
  labels:
    app: cassandra
spec:
  clusterIP: None
  ports:
  - port: 7000
    name: intra-node
  - port: 7001
    name: tls-intra-node
  - port: 7199
    name: jmx
  - port: 9042
    name: cql
  selector:
    app: cassandra

---

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  namespace: cassandra-stable-connection
  name: cassandra
spec:
  selector:
    matchLabels:
      app: cassandra
  minAvailable: 3

---

apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: cassandra-stable-connection
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 4
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: cassandra:4.0.8
        command:
        - /bin/bash
        - -c
        - |
          set -x
          hostname -f
          /usr/local/bin/docker-entrypoint.sh cassandra -f
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        securityContext:
          capabilities:
            add:
            - IPC_LOCK
        resources:
          limits:
            cpu: 1
            memory: 4Gi
        livenessProbe:
          tcpSocket:
            port: 7000
          initialDelaySeconds: 20
          failureThreshold: 20
          periodSeconds: 5
        readinessProbe:
          tcpSocket:
            port: 7000
          initialDelaySeconds: 20
          failureThreshold: 10
          periodSeconds: 5
        env:
        - name: MAX_HEAP_SIZE
          value: 1G
        - name: HEAP_NEWSIZE
          value: 100M
        - name: CASSANDRA_SEEDS
          value: "\
              cassandra-0.cassandra.cassandra-stable-connection.svc.cluster.local,\
              cassandra-1.cassandra.cassandra-stable-connection.svc.cluster.local,\
              cassandra-2.cassandra.cassandra-stable-connection.svc.cluster.local,\
              cassandra-3.cassandra.cassandra-stable-connection.svc.cluster.local\
              "
        - name: CASSANDRA_CLUSTER_NAME
          value: dptr-v2
        - name: CASSANDRA_DC
          value: dptr-v2-data-center-0
        - name: CASSANDRA_RACK
          value: dptr-v2-rack-0
        volumeMounts:
        - name: cassandra-data
          mountPath: /var/lib/cassandra
        - name: cassandra-data
          mountPath: /var/log/cassandra
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: local-path
      resources:
        requests:
          storage: 8Gi

Chameleon
Refer to the similar issue on the [Lightbend discussion forum](https://discuss.lightbend.com/t/cassandra-is-down-and-when-it-is-up-services-are-not-checking-for-cassandra-connectivity/3899) and the [GitHub link](https://github.com/akka/akka-persistence-cassandra/issues/445), which may help to resolve your issue. – Veera Nagireddy Mar 08 '23 at 10:37

0 Answers