
My current Kafka deployment file with 3 Kafka brokers looks like this:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: kafka
spec:
  selector:
    matchLabels:
      app: kafka
  serviceName: kafka-headless
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: Parallel
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka-instance
        image: wurstmeister/kafka
        ports:
        - containerPort: 9092
        env:
        - name: KAFKA_ADVERTISED_PORT
          value: "9092"
        - name: KAFKA_ADVERTISED_HOST_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: "zookeeper-0.zookeeper-headless.default.svc.cluster.local:2181,\
                  zookeeper-1.zookeeper-headless.default.svc.cluster.local:2181,\
                  zookeeper-2.zookeeper-headless.default.svc.cluster.local:2181"
        - name: BROKER_ID_COMMAND
          value: "hostname | awk -F '-' '{print $2}'"
        - name: KAFKA_CREATE_TOPICS
          value: hello:2:1
        volumeMounts:
        - name: data
          mountPath: /var/lib/kafka/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 50Gi

This creates 3 Kafka brokers as a StatefulSet and connects them to the ZooKeeper cluster through kube-dns, using FQDNs (fully qualified domain names) such as:

zookeeper-0.zookeeper-headless.default.svc.cluster.local:2181
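
Each of those per-pod DNS records exists only because the StatefulSet's serviceName points at a headless Service. For reference, a minimal sketch of such a Service for the Kafka brokers follows; this is an assumption (the question doesn't include it), and zookeeper-headless would look analogous:

apiVersion: v1
kind: Service
metadata:
  name: kafka-headless
spec:
  clusterIP: None   # headless: DNS returns per-pod records such as kafka-0.kafka-headless...
  selector:
    app: kafka
  ports:
  - name: broker
    port: 9092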

Broker IDs are generated based on the pod name:

- name: BROKER_ID_COMMAND
  value: "hostname | awk -F '-' '{print $2}'"

Result:

kafka-0 = 0
kafka-1 = 1
kafka-2 = 2

However, in order to use the kube-dns names for the Kafka brokers:

kafka-0.kafka-headless.default.svc.cluster.local:9092
kafka-1.kafka-headless.default.svc.cluster.local:9092
kafka-2.kafka-headless.default.svc.cluster.local:9092

I need to be able to set the KAFKA_ADVERTISED_HOST_NAME variable to the above FQDN values based on the name of the pod.

Currently I have the variable set to the name of the pod:

- name: KAFKA_ADVERTISED_HOST_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name

Result:

KAFKA_ADVERTISED_HOST_NAME=kafka-0
KAFKA_ADVERTISED_HOST_NAME=kafka-1
KAFKA_ADVERTISED_HOST_NAME=kafka-2

But somehow I would need to append the rest of the DNS name.

Is there a way I could set the DNS value directly?

Something like this:

- name: KAFKA_ADVERTISED_HOST_NAME
  valueFrom:
    fieldRef:
      fieldPath: kubedns.name
Dcompoze

3 Answers


I managed to solve the problem with a command field inside the pod definition:

    command:
    - sh
    - -c
    - "export KAFKA_ADVERTISED_HOST_NAME=$(hostname).kafka-headless.default.svc.cluster.local &&
       start-kafka.sh"

This runs a shell command that exports KAFKA_ADVERTISED_HOST_NAME as the pod's hostname plus the headless-service domain, and then hands off to the image's normal start-kafka.sh entrypoint.
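
In context, the relevant part of the container spec would look roughly like this (a sketch assuming the kafka-headless Service and default namespace from the question; adjust the domain if yours differ):

containers:
- name: kafka-instance
  image: wurstmeister/kafka
  ports:
  - containerPort: 9092
  # Override the image entrypoint: set the advertised host name to the pod's
  # stable per-pod DNS record, then run the normal start script.
  command:
  - sh
  - -c
  - "export KAFKA_ADVERTISED_HOST_NAME=$(hostname).kafka-headless.default.svc.cluster.local &&
     start-kafka.sh"
  env:
  - name: KAFKA_ADVERTISED_PORT
    value: "9092"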

Dcompoze
  • Could you please share whole deployment? – Dino L. Apr 02 '18 at 18:39
  • Nowadays, you can use `HOSTNAME_COMMAND` so you don't have to override the entrypoint. ([source](https://github.com/wurstmeister/kafka-docker/blob/master/README.md#injecting-hostname_command-into-configuration)) – Raphael Mar 19 '21 at 17:36
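
The HOSTNAME_COMMAND approach from that last comment would look roughly like this; a sketch based on the wurstmeister/kafka-docker README, again assuming the kafka-headless Service and default namespace:

env:
- name: HOSTNAME_COMMAND
  value: "echo $(hostname).kafka-headless.default.svc.cluster.local"
- name: KAFKA_ADVERTISED_LISTENERS
  # the image substitutes the command's output for the placeholder below
  value: "PLAINTEXT://_{HOSTNAME_COMMAND}:9092"

(_{HOSTNAME_COMMAND} is the image's own placeholder syntax, unrelated to the Kubernetes $(VAR) expansion used in the next answer.)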
- name: MY_POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name 
- name: KAFKA_ZOOKEEPER_CONNECT
  value: zook-zookeeper.zook.svc.cluster.local:2181
- name: KAFKA_PORT_NUMBER
  value: "9092"
- name: KAFKA_LISTENERS
  value: SASL_SSL://:$(KAFKA_PORT_NUMBER)
- name: KAFKA_ADVERTISED_LISTENERS
  value: SASL_SSL://$(MY_POD_NAME).kafka-kafka-headless.kafka.svc.cluster.local:$(KAFKA_PORT_NUMBER)

The above config builds the FQDN for each broker. You should be able to see those names in the Kafka logs when the broker starts.

NOTE: Kubernetes allows you to reference other environment variables using the $(VARIABLE) syntax, provided the referenced variable is defined earlier in the env list.

donhector

None of the above worked for me; my setup is wurstmeister/kafka:2.12-2.5.0 and wurstmeister/zookeeper:3.4.6 in a single pod on Kubernetes 1.16 (don't ask), with a ClusterIP Service on top that forwards 9092 to the Kafka container.

This set of environment variables works:

- name: KAFKA_LISTENERS
  value: "INSIDE://:9094,OUTSIDE://:9092"
- name: KAFKA_ADVERTISED_LISTENERS
  value: "INSIDE://:9094,OUTSIDE://my-service.my-namespace.svc.cluster.local:9092"
- name: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP
  value: "INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT" # not production-ready!
- name: KAFKA_INTER_BROKER_LISTENER_NAME
  value: INSIDE
- name: KAFKA_ZOOKEEPER_CONNECT
  value: "localhost:2181" # since it's in the same pod

Sources: wurstmeister/kafka doc, Kafka doc

The inherent problem seems to be that Kafka itself needs an IP-ish address to bind to and to talk to itself over, while clients need a DNS-ish name to connect to from the outside. The latter can't contain the pod name for some reason (possibly a separate configuration issue on my end).

Raphael
  • Thanks dear. Your solution removed my headache. I was facing problem like below `No security protocol defined for listener PLAINTEXT://some-ip-here:TCP` Following the config for listeners as you defined solved my problem. – Syed Muhammad Asad Sep 14 '21 at 12:37