Why does "Cassandra" use a "StatefulSet" instead of a "Deployment" file for "Kubernetes"?

Question

I am trying to deploy Cassandra on a local Kind cluster running on my Ubuntu 22.04 machine. The only set of instructions I found is this one, which uses a StatefulSet. Isn't a Deployment something newer? Why didn't they use a Deployment file instead of a StatefulSet? If a Deployment file is the better choice, can anybody help me convert this StatefulSet to a Deployment file?

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v13
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
              - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command: 
              - /bin/sh
              - -c
              - nodetool drain
        env:
          - name: MAX_HEAP_SIZE
            value: 512M
          - name: HEAP_NEWSIZE
            value: 100M
          - name: CASSANDRA_SEEDS
            value: "cassandra-0.cassandra.default.svc.cluster.local"
          - name: CASSANDRA_CLUSTER_NAME
            value: "K8Demo"
          - name: CASSANDRA_DC
            value: "DC1-K8Demo"
          - name: CASSANDRA_RACK
            value: "Rack1-K8Demo"
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        # These volume mounts are persistent. They are like inline claims,
        # but not exactly because the names need to match exactly one of
        # the stateful pod volumes.
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data

  # These are converted to volume claims by the controller
  # and mounted at the paths mentioned above.
  # do not use these in production until ssd GCEPersistentDisk or other
  # ssd pd
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast
      resources:
        requests:
          storage: 1Gi
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast
provisioner: k8s.io/minikube-hostpath
parameters:
  type: pd-ssd

score 1 · Accepted Answer · answered Dec 29 '22 at 22:22

A StatefulSet is different from a Deployment. From the documentation:

Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of their Pods. These pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling.
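
As an illustration of that sticky identity: the `serviceName: cassandra` field in the question's manifest refers to a headless Service, which is what gives each pod a stable DNS name such as `cassandra-0.cassandra.default.svc.cluster.local` (the value used for `CASSANDRA_SEEDS`). The question does not show that Service, so the following is only a minimal sketch of what it would typically look like, assuming the `default` namespace and the `app: cassandra` label:

```yaml
# Headless Service assumed to accompany the StatefulSet (not shown in the question).
apiVersion: v1
kind: Service
metadata:
  name: cassandra          # must match spec.serviceName in the StatefulSet
  labels:
    app: cassandra
spec:
  clusterIP: None          # headless: DNS resolves to the individual pod IPs
  ports:
  - port: 9042
    name: cql
  selector:
    app: cassandra
```

With a Service like this in place, the pods are named `cassandra-0`, `cassandra-1`, and `cassandra-2`, and each keeps its name when it is rescheduled.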

You use StatefulSets when your pods need to maintain some sort of unique state -- for example, the volumeClaimTemplates section of the manifest means that each pod gets its own PersistentVolumeClaim. This isn't possible with a Deployment, whose replicas all share one pod template and therefore reference the same claim.
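
To make this concrete, here is a sketch of the claim the StatefulSet controller would generate for the first pod from the `volumeClaimTemplates` entry in the question. Claim names follow the pattern `<template name>-<pod name>`. The values are copied from the question's manifest, and the object is normally created by the controller, not written by hand:

```yaml
# Generated for pod cassandra-0; cassandra-1 and cassandra-2 get
# cassandra-data-cassandra-1 and cassandra-data-cassandra-2.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cassandra-data-cassandra-0
spec:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: fast
  resources:
    requests:
      storage: 1Gi
```

If `cassandra-0` is deleted and recreated, it reattaches to this same claim, so its data survives.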

In general you cannot convert a StatefulSet into a Deployment unless you only plan on having a single replica.
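
For the single-replica case, a Deployment would look roughly like the sketch below. This is an untested outline, not a drop-in replacement: the standalone claim named `cassandra-data` and the `Recreate` strategy are assumptions added here, and the `env`, probe, and lifecycle settings from the question are omitted. `CASSANDRA_SEEDS` in particular would need adapting, because a Deployment pod has no stable name like `cassandra-0`.

```yaml
# Assumption: a hand-written claim replaces volumeClaimTemplates.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cassandra-data
spec:
  accessModes: [ "ReadWriteOnce" ]
  storageClassName: fast
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  replicas: 1              # more replicas would all point at the same claim
  strategy:
    type: Recreate         # stop the old pod before starting a new one
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v13
        ports:
        - containerPort: 9042
          name: cql
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data
      volumes:
      - name: cassandra-data
        persistentVolumeClaim:
          claimName: cassandra-data
```

Every replica of a Deployment references the one claim named under `volumes`, which is why this shape only works with `replicas: 1`.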

larsks


Thank you so much for clarifying. But `mysql`, for example, also needs a persistent volume, yet they still use a `deployment` file for it, as you can see here: https://kubernetes.io/docs/tasks/run-application/run-single-instance-stateful-application/ What is the difference? – best_of_man Dec 30 '22 at 00:02
The title of that page is "Run a Single-Instance Stateful Application". That reflects what I said in my answer: you can only use a Deployment to run a stateful application if you only have a single replica. – larsks Dec 30 '22 at 00:47
So this is not specific to `cassandra`; even for `mysql` we need a `StatefulSet` instead of a `deployment` file if we want more than one `mysql` instance. – best_of_man Dec 30 '22 at 00:51
That's correct (because each mysql instance will e.g. need its own volume for data, which you can do with a statefulset but not with a deployment). – larsks Dec 30 '22 at 01:14
I also tried using a `StatefulSet` to deploy `mysql` on multiple nodes, and I think I did it successfully, but why do I see `READY 2/2` when I run `kubectl get pods`? For the `cassandra` cluster I see `READY 1/1`. This is the first time I have seen `2/2`. What does it mean? – best_of_man Dec 30 '22 at 03:37
You might want to open a new question. Include both the output of `kubectl get pods` and the YAML manifest used to create the pods. – larsks Dec 30 '22 at 07:38
I opened a new question and also asked my previous question there: https://serverfault.com/questions/1119154/crashloopbackoff-while-deploying-mysql-on-multi-node-cluster – best_of_man Dec 30 '22 at 20:01
