Don't run the app container and the Cassandra node in the same pod. You want to be able to scale your Cassandra cluster independently of your application.
For the Cassandra side of things, I suggest:
- A replication controller so you can easily scale the number of Cassandra nodes. Luckily for us, C* nodes are all the same: there is no master, so every pod can run from the same template.
- A Cassandra service so that your application pods have a stable endpoint at which they can talk to C*.
- A headless Kubernetes service to provide your Cassandra pods with seed node IP addresses.
You will need to have DNS working in your Kubernetes cluster, because both the service hostname and the peer discovery described below rely on it.
The Cassandra Replication Controller
cassandra-replication-controller.yml
apiVersion: v1
kind: ReplicationController
metadata:
  labels:
    name: cassandra
  name: cassandra
spec:
  replicas: 1
  selector:
    name: cassandra
  template:
    metadata:
      labels:
        name: cassandra
    spec:
      containers:
        - image: vyshane/cassandra
          name: cassandra
          env:
            # Feel free to change the following:
            - name: CASSANDRA_CLUSTER_NAME
              value: Cassandra
            - name: CASSANDRA_DC
              value: DC1
            - name: CASSANDRA_RACK
              value: Kubernetes Cluster
            - name: CASSANDRA_ENDPOINT_SNITCH
              value: GossipingPropertyFileSnitch
            # The peer discovery domain needs to point to the Cassandra peer service
            - name: PEER_DISCOVERY_DOMAIN
              value: cassandra-peers.default.cluster.local.
          ports:
            - containerPort: 9042
              name: cql
          volumeMounts:
            - mountPath: /var/lib/cassandra/data
              name: data
      volumes:
        - name: data
          emptyDir: {}
The Cassandra Service
The Cassandra service is pretty simple. It exposes only the CQL port; add the Thrift port (9160 by default) if you need it.
cassandra-service.yml
apiVersion: v1
kind: Service
metadata:
  labels:
    name: cassandra
  name: cassandra
spec:
  ports:
    - port: 9042
      name: cql
  selector:
    name: cassandra
The Cassandra Peer Discovery Service
This is a headless Kubernetes service that provides the IP addresses of Cassandra peers via DNS A records. The peer service definition looks like this:
cassandra-peer-service.yml
apiVersion: v1
kind: Service
metadata:
  labels:
    name: cassandra-peers
  name: cassandra-peers
spec:
  clusterIP: None
  ports:
    - port: 7000
      name: intra-node-communication
    - port: 7001
      name: tls-intra-node-communication
  selector:
    name: cassandra
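Once the Cassandra pods are running, you can check what the headless service publishes from inside any pod that has dig installed. Here is a minimal sketch; the function name list_cassandra_peers is made up for illustration, and the default domain is simply the PEER_DISCOVERY_DOMAIN value from the replication controller above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original setup): print the A records
# behind the peer discovery service, one IP per line, in sorted order.
# The default domain mirrors PEER_DISCOVERY_DOMAIN; pass another as $1.
list_cassandra_peers() {
    local domain="${1:-cassandra-peers.default.cluster.local.}"
    dig +short "$domain" A | sort
}

# Usage, from a shell inside a pod in the cluster:
#   list_cassandra_peers
```

With a headless service you should see one address per ready Cassandra pod rather than a single cluster IP.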
The Cassandra Docker Image
We extend the official Cassandra image thus:
Dockerfile
FROM cassandra:2.2
MAINTAINER Vy-Shane Xie <shane@node.mu>
ENV REFRESHED_AT 2015-09-16
RUN apt-get -qq update && \
    DEBIAN_FRONTEND=noninteractive apt-get -yq install dnsutils && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
COPY custom-entrypoint.sh /
ENTRYPOINT ["/custom-entrypoint.sh"]
CMD ["cassandra", "-f"]
Notice the custom-entrypoint.sh script. It simply configures the seed nodes by querying our Cassandra peer discovery service:
custom-entrypoint.sh
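#!/bin/bash
#
# Configure Cassandra seed nodes.
my_ip=$(hostname --ip-address)

# Ask the peer discovery service for peer IPs, drop our own address
# (exact, fixed-string match so 10.0.0.1 does not also drop 10.0.0.10),
# and use up to two of the remaining peers as seeds.
CASSANDRA_SEEDS=$(dig "$PEER_DISCOVERY_DOMAIN" +short | \
    grep -vxF "$my_ip" | \
    sort | \
    head -2 | xargs | \
    sed -e 's/ /,/g')

export CASSANDRA_SEEDS

# Replace this shell with the image's stock entrypoint so that signals
# sent to the container reach Cassandra.
exec /docker-entrypoint.sh "$@"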
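If you want to see what the seed selection does without a cluster, the pipeline can be tried on sample input. The following is only a sketch: build_seed_list is a made-up name, the addresses are invented, and the list of IPs on stdin stands in for the output of dig. It filters with grep -vxF so that an address is dropped only on an exact, whole-line match.

```shell
#!/bin/bash
# Hypothetical helper: read peer IPs (one per line) on stdin and print a
# comma-separated seed list of at most two peers, excluding our own IP ($1).
build_seed_list() {
    local my_ip="$1"
    grep -vxF -- "$my_ip" | sort | head -2 | paste -sd, -
}

# Invented sample addresses standing in for `dig +short` output.
printf '10.244.1.7\n10.244.0.3\n10.244.2.9\n' | build_seed_list 10.244.0.3
# prints 10.244.1.7,10.244.2.9
```

When the pod is the only one in the cluster, the result is empty, which matches the one-node case where there are no other peers to use as seeds.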
Starting Cassandra
To start Cassandra, simply run:
kubectl create -f cassandra-peer-service.yml
kubectl create -f cassandra-service.yml
kubectl create -f cassandra-replication-controller.yml
This will give you a one-node Cassandra cluster. To add another node:
kubectl scale rc cassandra --replicas=2
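After scaling, it can be useful to confirm that the new node has actually joined the ring. One way, sketched below, is to count the nodes that nodetool status reports as Up and Normal (lines starting with UN). The function names, the polling interval and the retry count are assumptions for illustration, and the pod name has to be supplied by you.

```shell
#!/bin/bash
# Count the nodes that `nodetool status` (read from stdin) reports as
# Up and Normal, i.e. lines that start with "UN".
count_up_nodes() {
    grep -c '^UN[[:space:]]'
}

# Hypothetical helper: poll a running Cassandra pod until at least
# $2 nodes are UN, checking up to $3 times (default 30), 10 s apart.
wait_for_nodes() {
    local pod="$1" expected="$2" tries="${3:-30}"
    local i up
    for ((i = 0; i < tries; i++)); do
        up=$(kubectl exec "$pod" -- nodetool status 2>/dev/null | count_up_nodes || true)
        if [ "${up:-0}" -ge "$expected" ]; then
            return 0
        fi
        sleep 10
    done
    return 1
}

# Usage (pod names come from: kubectl get pods -l name=cassandra):
#   wait_for_nodes <cassandra-pod-name> 2 && echo "both nodes are up"
```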
Talking to Cassandra
Your application pods can connect to Cassandra using the cassandra hostname. It points to the Cassandra service.
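As a rough sketch of how an application pod's start script might derive its contact point: the host defaults to the cassandra service name and the port to 9042, the CQL port defined in the service. CASSANDRA_HOST is a made-up override variable; CASSANDRA_SERVICE_PORT is the environment variable Kubernetes injects for a service named cassandra, provided the service existed before the pod started.

```shell
#!/bin/bash
# Hypothetical helper for an application pod: print the Cassandra contact
# point as host:port.
#   CASSANDRA_HOST          made-up override, defaults to the service name
#   CASSANDRA_SERVICE_PORT  injected by Kubernetes for the cassandra service,
#                           falls back to the CQL port from the service spec
cassandra_contact_point() {
    local host="${CASSANDRA_HOST:-cassandra}"
    local port="${CASSANDRA_SERVICE_PORT:-9042}"
    printf '%s:%s\n' "$host" "$port"
}

cassandra_contact_point
```

For a quick manual check from a pod that has cqlsh installed, `cqlsh cassandra 9042` should connect through the same service.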
Show me the code
I made a GitHub repo with the above setup: Multinode Cassandra Cluster on Kubernetes.