
I am trying to set up a multi-broker Kafka cluster on a Kubernetes cluster hosted in Azure. I have a single-broker setup working. For the multi-broker setup, I currently have an ensemble of three ZooKeeper nodes that manage the Kafka service, and I deploy the Kafka brokers as a replication controller with three replicas, i.e. three brokers. How can I register the three brokers with ZooKeeper so that each of them registers a different IP address?

I bring up the replication controller after the Service is deployed and use the Service's Cluster IP in my replication-controller YAML file to specify two advertised.listeners, one for SSL and one for PLAINTEXT. However, in this scenario all brokers register with the same IP, and writes to replicas fail. I don't want to deploy each broker as a separate replication controller/pod plus Service, because scaling then becomes an issue. I would really appreciate any thoughts/ideas on this.
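
Roughly, the relevant part of my replication-controller spec looks like the fragment below (the Cluster IP and the environment-variable names are placeholders for whatever the Kafka image actually maps to server.properties); since every replica renders the same Service IP, all three brokers advertise identical endpoints:

# Sketch of the broker container env in the replication controller.
# 10.0.0.100 stands in for the Service's Cluster IP; env-var names depend on the Kafka image used.
env:
- name: KAFKA_LISTENERS
  value: "PLAINTEXT://0.0.0.0:9092,SSL://0.0.0.0:9093"
- name: KAFKA_ADVERTISED_LISTENERS
  # Every replica gets the same value, so every broker registers the same host:port in ZooKeeper.
  value: "PLAINTEXT://10.0.0.100:9092,SSL://10.0.0.100:9093"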

Edit 1:

I am additionally trying to expose the cluster to another VPC in the cloud. I have to expose SSL and PLAINTEXT ports for clients, which I am doing through advertised.listeners. If I use a StatefulSet with three replicas and let Kubernetes advertise the canonical host names of the pods, those names cannot be resolved by an external client. The only way I got this working is to expose an external Service for each broker, roughly as sketched below, but this does not scale.
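
For completeness, the per-broker exposure that does work looks roughly like the Service below, repeated once per broker (the names, ports and selector label are placeholders); every additional broker needs another Service like this, which is exactly the part that does not scale:

# One externally reachable Service per broker (sketch; names and ports are placeholders).
# Adding a broker means hand-creating another Service like this one.
apiVersion: v1
kind: Service
metadata:
  name: kafka-0-external
spec:
  type: LoadBalancer                                # or NodePort, depending on how the VPCs are connected
  selector:
    statefulset.kubernetes.io/pod-name: kafka-0     # per-pod label set automatically on StatefulSet pods
  ports:
  - name: plaintext
    port: 9092
    targetPort: 9092
  - name: ssl
    port: 9093
    targetPort: 9093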

Annu
  • Were you ever able to solve this? – Mike S. Dec 16 '17 at 12:12
  • Hi Mike, I have not solved the "ease of scalability" problem with Kafka. I am running each broker in a StatefulSet (for a persistent volume), with the advertised host set to the ingress IP address (of my Kubernetes cluster):port, using a different port for each broker. This is tough to scale, as new ports need to be opened for additional brokers and the advertised ports in broker.yaml have to be changed accordingly. – Annu Dec 20 '17 at 21:21

1 Answer

Kubernetes has the concept of StatefulSets to solve these issues. Each instance of a StatefulSet gets its own stable DNS name, so you can address each instance individually by that name.
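
For example, with the headless Service and StatefulSet from the manifest below (namespace default assumed), the individual ZooKeeper pods are reachable at predictable names:

# Stable per-pod DNS names created by the headless Service zk-headless
# (pattern: <pod-name>.<service-name>.<namespace>.svc.cluster.local):
#   zk-0.zk-headless.default.svc.cluster.local
#   zk-1.zk-headless.default.svc.cluster.local
#   zk-2.zk-headless.default.svc.cluster.local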

This concept is described here in more detail. You can also take a look at this complete example:

apiVersion: v1
kind: Service
metadata:
  name: zk-headless
  labels:
    app: zk-headless
spec:
  ports:
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  clusterIP: None
  selector:
    app: zk
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: zk-config
data:
  ensemble: "zk-0;zk-1;zk-2"
  jvm.heap: "2G"
  tick: "2000"
  init: "10"
  sync: "5"
  client.cnxns: "60"
  snap.retain: "3"
  purge.interval: "1"
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-budget
spec:
  selector:
    matchLabels:
      app: zk
  minAvailable: 2
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: zk
spec:
  serviceName: zk-headless
  replicas: 3
  template:
    metadata:
      labels:
        app: zk
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"

    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                    - zk        # must match the pod template's app: zk label
              topologyKey: "kubernetes.io/hostname"
      containers:
      - name: k8szk
        imagePullPolicy: Always
        image: gcr.io/google_samples/k8szk:v1
        resources:
          requests:
            memory: "4Gi"
            cpu: "1"
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        env:
        - name: ZK_ENSEMBLE
          valueFrom:
            configMapKeyRef:
              name: zk-config
              key: ensemble
        - name: ZK_HEAP_SIZE
          valueFrom:
            configMapKeyRef:
                name: zk-config
                key: jvm.heap
        - name: ZK_TICK_TIME
          valueFrom:
            configMapKeyRef:
                name: zk-config
                key: tick
        - name: ZK_INIT_LIMIT
          valueFrom:
            configMapKeyRef:
                name: zk-config
                key: init
        - name: ZK_SYNC_LIMIT
          valueFrom:
            configMapKeyRef:
                name: zk-config
                key: sync
        - name: ZK_MAX_CLIENT_CNXNS
          valueFrom:
            configMapKeyRef:
                name: zk-config
                key: client.cnxns
        - name: ZK_SNAP_RETAIN_COUNT
          valueFrom:
            configMapKeyRef:
                name: zk-config
                key: snap.retain
        - name: ZK_PURGE_INTERVAL
          valueFrom:
            configMapKeyRef:
                name: zk-config
                key: purge.interval
        - name: ZK_CLIENT_PORT
          value: "2181"
        - name: ZK_SERVER_PORT
          value: "2888"
        - name: ZK_ELECTION_PORT
          value: "3888"
        command:
        - sh
        - -c
        - zkGenConfig.sh && zkServer.sh start-foreground
        readinessProbe:
          exec:
            command:
            - "zkOk.sh"
          initialDelaySeconds: 15
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - "zkOk.sh"
          initialDelaySeconds: 15
          timeoutSeconds: 5
        volumeMounts:
        - name: datadir
          mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
  - metadata:
      name: datadir
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 20Gi
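
The same mechanism solves the advertised.listeners problem for the brokers themselves. Here is a minimal, untested sketch of the broker container in a Kafka StatefulSet, assuming a headless Service named kafka-headless, the default namespace, and an image that ships kafka-server-start and accepts --override flags (the Confluent image below is only an example); each pod derives a unique broker.id and a unique advertised listener from its own hostname:

# Sketch only: broker container of a Kafka StatefulSet (serviceName: kafka-headless assumed).
# $HOSTNAME is kafka-0, kafka-1, ..., so every replica computes its own broker.id and listeners.
containers:
- name: kafka
  image: confluentinc/cp-kafka:3.2.2        # assumption; any image exposing kafka-server-start works
  ports:
  - containerPort: 9092
    name: plaintext
  - containerPort: 9093
    name: ssl
  command:
  - sh
  - -c
  - >
    BROKER_ID=${HOSTNAME##*-} &&
    FQDN=${HOSTNAME}.kafka-headless.default.svc.cluster.local &&
    exec kafka-server-start /etc/kafka/server.properties
    --override broker.id=${BROKER_ID}
    --override listeners=PLAINTEXT://0.0.0.0:9092,SSL://0.0.0.0:9093
    --override advertised.listeners=PLAINTEXT://${FQDN}:9092,SSL://${FQDN}:9093
    --override zookeeper.connect=zk-0.zk-headless:2181,zk-1.zk-headless:2181,zk-2.zk-headless:2181

SSL keystore settings and external exposure are omitted from the sketch; the point is only that each broker advertises its own per-pod DNS name instead of a shared Service IP.
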
Lukas Eichler
  • Hi Lukas, thanks for the response. My concern, however, is that all the brokers (that are part of the replication controller) register the same advertised.listener host and port with ZooKeeper. Will they register differently if they are part of a StatefulSet, even if they are still advertising the same host and port? I will try the approach nonetheless and let you know! – Annu Aug 02 '17 at 16:48
  • Hi @Annu, this approach is a proven way to run a multi-ZooKeeper and multi-broker Kafka cluster on Kubernetes. You can take a look at https://github.com/kubernetes/charts/tree/master/incubator/kafka, which shows a good default Kafka deployment on Kubernetes with Helm. – Lukas Eichler Aug 02 '17 at 18:11
  • Hi @Lukas, I am still struggling to get this to work with a StatefulSet. I am using advertised.listeners to expose SSL and PLAINTEXT ports. My problem is that I need to be able to access the cluster from a different VPC in the cloud. My understanding of Kafka is that the producer/consumer needs to be able to directly connect to (resolve) the brokers to read/write data. I don't want to use three different services to expose my brokers, but I don't see any other way of going about it. – Annu Aug 07 '17 at 19:01
  • Were you ever able to solve this? I'm facing a similar situation and have a ZooKeeper cluster using a StatefulSet like the answer above. A StatefulSet works for spinning up the Kafka cluster, and I used the postStart lifecycle hook to set broker IDs, but the pods won't start without setting the advertised listener. I'm unsure how to get a dynamic value into `env` via the manifest config. – Mike S. Dec 16 '17 at 12:17
  • How are you setting your advertised hosts in the StatefulSet? Are you using the canonical names generated by Kubernetes StatefulSets? – Annu Dec 20 '17 at 21:27