I have a Cassandra StatefulSet in Kubernetes with 4 pods: cassandra-0, cassandra-1, cassandra-2, cassandra-3. Sometimes I need to shut down all nodes and start them all again. I cannot do this node by node; all nodes go down and come back at the same time.
My clients lose their connections, and after the restart they cannot reconnect, because the Cassandra Python driver tries to reconnect to the same IP addresses (it does not use the DNS name to resolve the new IP address of cassandra-0).
Why does the Cassandra driver not look up hosts by their DNS names, but keep the old IP addresses?
Should I add some load balancer, connection factory, or similar class? How do I configure the Cassandra driver for Kubernetes, where the database has static DNS names but dynamic IP addresses?
Assume my connection looks like this (simplified for explanation):
cluster = Cluster(
    contact_points=['cassandra-0.cassandra',
                    'cassandra-1.cassandra',
                    'cassandra-2.cassandra',
                    'cassandra-3.cassandra']
)
session = cluster.connect()
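To check whether the names really resolve to new addresses after the restart, a small diagnostic along these lines may help. It is only a sketch: `resolve_contact_points` is a hypothetical helper, not part of the driver, and it relies only on the standard library's `socket.getaddrinfo`. The `resolver` parameter is an assumption added so the lookup can be swapped out in tests.

```python
import socket


def resolve_contact_points(names, port=9042, resolver=socket.getaddrinfo):
    """Map each DNS name to the sorted IP addresses it resolves to right now.

    Hypothetical diagnostic helper; names that fail to resolve map to [].
    """
    resolved = {}
    for name in names:
        try:
            infos = resolver(name, port, type=socket.SOCK_STREAM)
        except socket.gaierror:
            # The pod may be down, or the name may not exist in this namespace.
            resolved[name] = []
            continue
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address.
        resolved[name] = sorted({info[4][0] for info in infos})
    return resolved
```

Running this before and after the restart, and comparing the result with the addresses the driver reports (for example via `cluster.metadata.all_hosts()`), should show whether the driver is still holding the old IPs.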
Full test code:
import logging
import random
import time

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster, Session, ExecutionProfile


def main():
    logging.basicConfig(level=logging.DEBUG)
    logger = logging.getLogger(__name__)

    cluster = Cluster(
        contact_points=['cassandra-0.cassandra',
                        'cassandra-1.cassandra',
                        'cassandra-2.cassandra',
                        'cassandra-3.cassandra']
    )
    cluster.add_execution_profile(
        'all',
        ExecutionProfile(
            consistency_level=ConsistencyLevel.ALL,
        )
    )
    cluster.add_execution_profile(
        'local_quorum',
        ExecutionProfile(
            consistency_level=ConsistencyLevel.LOCAL_QUORUM,
            serial_consistency_level=ConsistencyLevel.LOCAL_SERIAL,
        )
    )
    session = cluster.connect()

    result = session.execute('''
        create keyspace if not exists test
        with durable_writes = true
        and replication = {
            'class' : 'SimpleStrategy',
            'replication_factor' : 3
        }''', execution_profile='all')
    session.set_keyspace('test')
    result = session.execute('''
        create table if not exists test (
            test_id uuid,
            creation_date timestamp,
            primary key (test_id)
        )
        ''', execution_profile='all')

    while True:
        try:
            start_time = time.perf_counter()
            session.execute('''
                insert into test (
                    test_id, creation_date
                )
                values (
                    now(), toTimestamp(now())
                )
                ''', execution_profile='local_quorum')
            end_time = time.perf_counter()
        except Exception:
            logger.exception('Insert failed.')
        else:
            logger.debug('Message is sent in %s.', end_time - start_time)
        time.sleep(1.0 / random.randint(1, 5))


if __name__ == '__main__':
    main()
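One workaround that might be worth trying with the test loop above is to throw away the `Cluster` object when the connection is lost and build a new one, on the assumption that the contact-point names are resolved again when a fresh `Cluster` is created. I have not confirmed that this is the intended way to handle it. `ReconnectingSession` and its parameters are hypothetical names, not part of the driver; the sketch only needs a factory returning a `(cluster, session)` pair.

```python
import time


class ReconnectingSession:
    """Hypothetical wrapper: rebuilds cluster and session after a failure."""

    def __init__(self, connect, retry_on, max_attempts=5, delay=1.0):
        self._connect = connect            # callable returning (cluster, session)
        self._retry_on = retry_on          # tuple of exception types to retry on
        self._max_attempts = max_attempts
        self._delay = delay
        self._cluster = None
        self._session = None

    def _reset(self):
        # Drop the old cluster so the next attempt starts from the DNS names.
        if self._cluster is not None:
            try:
                self._cluster.shutdown()
            except Exception:
                pass
        self._cluster = None
        self._session = None

    def execute(self, *args, **kwargs):
        last_error = None
        for _ in range(self._max_attempts):
            try:
                if self._session is None:
                    self._cluster, self._session = self._connect()
                return self._session.execute(*args, **kwargs)
            except self._retry_on as error:
                last_error = error
                self._reset()
                time.sleep(self._delay)
        raise last_error


# Possible usage with the driver (assumes cassandra-driver is installed):
#
#   from cassandra.cluster import Cluster, NoHostAvailable
#
#   def connect():
#       cluster = Cluster(contact_points=['cassandra-0.cassandra',
#                                         'cassandra-1.cassandra',
#                                         'cassandra-2.cassandra',
#                                         'cassandra-3.cassandra'])
#       session = cluster.connect()
#       return cluster, session
#
#   session = ReconnectingSession(connect, retry_on=(NoHostAvailable,))
```

Note that this retries the statement itself, so it is only safe for statements that may run more than once; the insert in the test loop generates a new `test_id` on every attempt.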
The Kubernetes manifests I use to set up Cassandra for tests, included to show the cluster architecture:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-path
provisioner: rancher.io/local-path
reclaimPolicy: Delete
---
apiVersion: v1
kind: Service
metadata:
  namespace: cassandra-stable-connection
  name: cassandra
  labels:
    app: cassandra
spec:
  clusterIP: None
  ports:
    - port: 7000
      name: intra-node
    - port: 7001
      name: tls-intra-node
    - port: 7199
      name: jmx
    - port: 9042
      name: cql
  selector:
    app: cassandra
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  namespace: cassandra-stable-connection
  name: cassandra
spec:
  selector:
    matchLabels:
      app: cassandra
  minAvailable: 3
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: cassandra-stable-connection
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 4
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra:4.0.8
          command:
            - /bin/bash
            - -c
            - |
              set -x
              hostname -f
              /usr/local/bin/docker-entrypoint.sh cassandra -f
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
          securityContext:
            capabilities:
              add:
                - IPC_LOCK
          resources:
            limits:
              cpu: 1
              memory: 4Gi
          livenessProbe:
            tcpSocket:
              port: 7000
            initialDelaySeconds: 20
            failureThreshold: 20
            periodSeconds: 5
          readinessProbe:
            tcpSocket:
              port: 7000
            initialDelaySeconds: 20
            failureThreshold: 10
            periodSeconds: 5
          env:
            - name: MAX_HEAP_SIZE
              value: 1G
            - name: HEAP_NEWSIZE
              value: 100M
            - name: CASSANDRA_SEEDS
              value: "\
                cassandra-0.cassandra.cassandra-stable-connection.svc.cluster.local,\
                cassandra-1.cassandra.cassandra-stable-connection.svc.cluster.local,\
                cassandra-2.cassandra.cassandra-stable-connection.svc.cluster.local,\
                cassandra-3.cassandra.cassandra-stable-connection.svc.cluster.local\
                "
            - name: CASSANDRA_CLUSTER_NAME
              value: dptr-v2
            - name: CASSANDRA_DC
              value: dptr-v2-data-center-0
            - name: CASSANDRA_RACK
              value: dptr-v2-rack-0
          volumeMounts:
            - name: cassandra-data
              mountPath: /var/lib/cassandra
            - name: cassandra-data
              mountPath: /var/log/cassandra
  volumeClaimTemplates:
    - metadata:
        name: cassandra-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: local-path
        resources:
          requests:
            storage: 8Gi
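The seed list in the StatefulSet above follows the usual pattern for pods behind a headless service: `<statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster domain>`. As a minimal sketch, the same names could be generated on the client side instead of being hard-coded. `statefulset_pod_dns_names` is a hypothetical helper, and `cluster.local` as the default cluster domain is an assumption taken from the manifest.

```python
def statefulset_pod_dns_names(statefulset, service, namespace, replicas,
                              cluster_domain='cluster.local'):
    """Build the fully qualified DNS name of every pod in a StatefulSet.

    Hypothetical helper; assumes a headless governing service and pods
    numbered from 0 to replicas - 1.
    """
    return [
        f'{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}'
        for ordinal in range(replicas)
    ]


# Values taken from the manifests above.
contact_points = statefulset_pod_dns_names(
    statefulset='cassandra',
    service='cassandra',
    namespace='cassandra-stable-connection',
    replicas=4,
)
```

The short form `cassandra-0.cassandra` used in the client code only resolves from inside the same namespace, so the fully qualified names may be the safer choice when the client runs elsewhere in the cluster.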