
I am trying to deploy Apache Kafka (not Confluent Kafka) in Docker containers and connect to it using kafka-python's producer and consumer APIs. The producer and consumer should be able to run outside the Docker containers.


The pipeline consists of a ZooKeeper instance and 3 Kafka brokers, each residing in a separate container. The following is the docker-compose file I use to initialize the containers.

services:
  zookeeper-1:
    container_name: zookeeper-1
    image: zookeeper:2.7.0
    build:
      context: ./zookeeper
    volumes:
      - ./config/zookeeper-1/zookeeper.properties:/kafka/config/zookeeper.properties

  kafka-1:
    container_name: kafka-1
    image: kafka:2.7.0
    build:
      context: .
    volumes:
      - ./config/kafka-1/server.properties:/kafka/config/server.properties
      - ./data/kafka-1/:/tmp/kafka-logs/

  kafka-2:
    container_name: kafka-2
    image: kafka:2.7.0
    build:
      context: .
    volumes:
      - ./config/kafka-2/server.properties:/kafka/config/server.properties
      - ./data/kafka-2/:/tmp/kafka-logs/

  kafka-3:
    container_name: kafka-3
    image: kafka:2.7.0
    build:
      context: .
    volumes:
      - ./config/kafka-3/server.properties:/kafka/config/server.properties
      - ./data/kafka-3/:/tmp/kafka-logs/

I am able to produce and consume messages using the CLI from within the Docker containers.


I would like to be able to access Kafka from outside the Docker containers. I know that this requires modifying the docker-compose file to add listeners, but I am new to Docker and have not been able to add them correctly.

Please suggest the additions that are required so that I may access Kafka from outside the containers, on the same machine.


P.S.: I have already visited most of the Kafka Docker container discussions on SO, including "Connect to Kafka running in Docker" and "Interact with kafka docker container from outside of docker host [duplicate]", and they do not answer my query.

TennisTechBoy
  • 1) Running multiple brokers on one machine doesn't improve anything and wastes your HDD lifespan and space. 2) `build` and `image` shouldn't be used together. 3) What do your configurations currently look like? 4) You understand you need listeners, so what have you tried, and what is unclear from the linked posts, so that future readers can benefit from a centralized answer? – OneCricketeer Sep 28 '21 at 14:32

1 Answer


You can add environment variables to each Kafka service that allow internal DNS access via one port and external access via another. For example, for kafka-1 we can publish a port and add environment variables that make Kafka listen for connections via localhost (i.e. a Python producer running on your host machine):

kafka-1:
  container_name: kafka-1
  image: kafka:2.7.0
  build:
    context: .
  volumes:
    - ./config/kafka-1/server.properties:/kafka/config/server.properties
    - ./data/kafka-1/:/tmp/kafka-logs/
  ports:
    - 9092:9092
  environment:
    KAFKA_ADVERTISED_HOST_NAME: kafka-1
    KAFKA_BROKER_ID: 1
    KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:2181
    KAFKA_LISTENERS: PLAINTEXT://kafka-1:29092,PLAINTEXT_HOST://localhost:9092
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka-1:29092,PLAINTEXT_HOST://localhost:9092
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

I do see that you are doing a docker build, so I can't be sure of your configuration and setup, but you can add the same settings to your mounted server.properties configuration file instead if environment variables are not suitable.
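
For reference, the equivalent broker settings in server.properties would look roughly like the following. This is only a sketch that mirrors the environment variables above; the kafka-1 hostname, the 29092/9092 ports and the broker id are assumptions taken from your compose file, so adjust them to your setup.

# Sketch of the equivalent settings in server.properties for kafka-1
# (mirrors the environment variables above; adjust names/ports as needed)
broker.id=1
zookeeper.connect=zookeeper-1:2181

# One listener for traffic inside the compose network, one for clients on the host
listeners=PLAINTEXT://kafka-1:29092,PLAINTEXT_HOST://localhost:9092
advertised.listeners=PLAINTEXT://kafka-1:29092,PLAINTEXT_HOST://localhost:9092
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
inter.broker.listener.name=PLAINTEXT

offsets.topic.replication.factor=1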

I'm also assuming zookeeper is serving on port 2181, and the above does not consider security requirements etc.

When a client connects, Kafka relays back the advertised name configured for the listener it connected on, and the client uses that name for its follow-up connections, so the connection only succeeds if that name is reachable from the client. For example:

  • Kafka-python client connects to "localhost:9092" running on host machine
  • Kafka server relays "localhost" back as that is the name associated with 9092
  • Connection succeeds

However, this will not work:

  • Kafka-python client running remotely, connecting via "SERVERIP:9092" to host machine
  • Kafka server relays "localhost" back as that is the name associated with 9092
  • Connection fails as the names do not match

If you need remote access, bind the listener to 0.0.0.0 and advertise an address the remote client can actually reach (such as the server's IP or hostname) instead of localhost.
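
For example, with kafka-1 published on localhost:9092 as above, a minimal kafka-python check from the host machine might look like this (a sketch; the topic name test-topic is an assumption, so use your own topic):

from kafka import KafkaProducer

# Sketch: producer running on the host machine (outside Docker).
# Assumes kafka-1's PLAINTEXT_HOST listener is published on localhost:9092
# and that the topic "test-topic" exists (or topic auto-creation is enabled).
producer = KafkaProducer(bootstrap_servers="localhost:9092")

# The broker advertises "localhost:9092" back for this listener,
# so the follow-up connection from the host succeeds.
producer.send("test-topic", b"hello from outside the container")
producer.flush()
producer.close()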

To use multiple brokers in this docker-compose setup, use something like 9092 for kafka-1, 9093 for kafka-2 and 9094 for kafka-3, and publish those ports accordingly. Then your bootstrap servers are just:

localhost:9092, localhost:9093, localhost:9094
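
A consumer on the host can then bootstrap against all three published ports, for example (again a sketch; the test-topic name and the 9093/9094 mappings are assumptions):

from kafka import KafkaConsumer

# Sketch: consumer on the host, bootstrapping against all three brokers.
# Assumes kafka-2 and kafka-3 publish their host listeners on 9093 and 9094.
consumer = KafkaConsumer(
    "test-topic",
    bootstrap_servers=["localhost:9092", "localhost:9093", "localhost:9094"],
    auto_offset_reset="earliest",
    consumer_timeout_ms=10000,  # stop iterating if nothing arrives for 10 seconds
)
for message in consumer:
    print(message.topic, message.partition, message.offset, message.value)
consumer.close()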

You can set up multiple listeners too, so you can have one port for localhost access, one for internal network DNS access (within docker-compose), and another for remote access, and change the security settings for each type. This link provides more information on that.

clarj