I am trying to get a non-clustered zookeeper/kafka setup that containers running python scripts can talk to. I want to run one zookeeper/kafka container and 2 or more containers with python scripts communicating with that zookeeper/kafka, all running in containers or container groups on Azure.

To test this, I have created the below docker container group, with zookeeper and kafka as 2 services and a 3rd service that starts a simple python script to produce a steady pace of messages to a kafka topic. The docker-compose.yml that I am using is as follows:

version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    ports:
      - 22181:2181
    networks:
      - my-network

  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    depends_on:
      - zookeeper
    ports:
      - 29092:29092
    networks:
      - my-network
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  kafka_producer:
    build: ../kafka_producer
    image: annabotkafka.azurecr.io/kafka_producer:v1
    container_name: kafka_producer
    depends_on:
      - kafka
    volumes:
      - .:/usr/src/kafka_producer
    networks:
      - my-network
    environment:
      KAFKA_SERVERS: kafka:9092
networks:
  my-network:
    driver: bridge

The kafka_producer.py script is as follows:

import os
from time import sleep
import json
from confluent_kafka import Producer

def acked(err, msg):
    if err is not None:
        print("Failed to deliver message: {0}: {1}"
              .format(msg.value(), err.str()))
    else:
        print("Message produced: {0}".format(msg.value()))

# Function to send a status message out on the status topic
def send_status(producer,counter):
    msg = {'counter':counter}
    json_dump = json.dumps(msg)
    producer.produce("counter", json_dump.encode('utf-8'), callback=acked)
    producer.poll()

# Define kafkaProducer to push messages to the status topic
producer = Producer({'bootstrap.servers': 'kafka:9092'})

for j in range(9999):
    print("Iteration", j)
    send_status(producer, j)
    sleep(2)

When I 'docker-compose up' this on my Ubuntu 20.04 dev machine, I get the expected behaviour: a steady stream of messages sent by the producer to the kafka broker.

After I 'docker-compose push' this to Azure Container Instances and create a container in Azure with the image, the kafka_producer script no longer appears able to connect to the kafka broker at kafka:9092.

These are the logs from the container group after startup:

Iteration 0
%3|1629363616.468|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Failed to resolve 'kafka:9092': Name or service not known (after 25ms in state CONNECT)
%3|1629363618.465|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Failed to resolve 'kafka:9092': Name or service not known (after 22ms in state CONNECT, 1 identical error(s) suppressed)
Iteration 1
Iteration 2

I had understood that the container group is on the same network subnet and on a single host so I would expect this to operate the same as on my dev machine locally.

My next step will be to have separate containers with different python scripts that communicate with kafka in this container group. Having the producer script within the same container group is not my long-term plan, but I believed this simpler setup should work.

Any suggestions for where I am going wrong?

db533
  • Where are you actually running Kafka in Azure? I highly doubt you're running it in a container (or at least, you definitely shouldn't, as its data isn't persistent). You should read this post anyway: https://www.confluent.io/blog/kafka-listeners-explained/ – OneCricketeer Aug 19 '21 at 11:12
  • @OneCricketeer I am indeed running it in a container instance. In my use-case, kafka is providing asynchronous messaging between multiple scripts. The lack of persistence is not an issue in this case. I’ll check the link you shared shortly. – db533 Aug 19 '21 at 11:29
  • Sure, but my point is that you could/should run Kafka either in an actual, persistent VM instance / AKS, or use Event Hubs in Kafka mode – OneCricketeer Aug 19 '21 at 11:32

1 Answer

From the Azure documentation:

Within a container group, container instances can reach each other via localhost on any port, even if those ports aren't exposed externally on the group's IP address or from the container.

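This makes it sound like the containers share the host's network rather than a Docker bridge network like the one you've set up in Compose (where your code works fine). On a Compose bridge network the service name kafka resolves through Docker's built-in DNS; in a container group there is apparently no such name to resolve.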
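
The "Failed to resolve 'kafka:9092'" lines in your logs are consistent with this: the failure is in name resolution, before any Kafka traffic happens. A minimal sketch for confirming it from inside the producer container, using only the standard library (the helper name can_resolve is hypothetical, not part of any Kafka client):

```python
import socket


def can_resolve(host, port=9092):
    """Return True if the hostname resolves from inside this container."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        # Same class of failure librdkafka reports as "Name or service not known".
        return False


if __name__ == "__main__":
    for name in ("kafka", "localhost"):
        print(name, "resolves:", can_resolve(name))
```

If kafka does not resolve in the container group while localhost does, the broker address is the problem rather than the broker itself.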

Therefore, you ought to connect with localhost:29092, which matches the PLAINTEXT_HOST listener that your broker already advertises.
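
Your script hard-codes kafka:9092 even though the compose file already passes a KAFKA_SERVERS variable. A minimal sketch of reading the address from the environment, so the same image could run in both places; the helper name producer_config is hypothetical, and I'm assuming the variable name from your compose file:

```python
import os


def producer_config(env=None):
    """Build the Producer config from KAFKA_SERVERS (hypothetical helper)."""
    env = os.environ if env is None else env
    # Fall back to the Compose service name when the variable is unset.
    servers = env.get("KAFKA_SERVERS", "kafka:9092")
    return {"bootstrap.servers": servers}


if __name__ == "__main__":
    # In kafka_producer.py this would become: producer = Producer(producer_config())
    print(producer_config())
```

Under Compose you would keep KAFKA_SERVERS set to kafka:9092; in the Azure container group you would set it to localhost:29092.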

If you don't actually need message persistence, then I'd suggest using sockets via HTTP, gRPC or ZeroMQ between your scripts rather than a Kafka container.

OneCricketeer
  • Thanks for the good feedback. Have been exploring Event Hubs and that looks to be a far better solution than creating a container with zookeeper and kafka. – db533 Aug 22 '21 at 17:56
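
For anyone following the Event Hubs route mentioned in the comments, here is a rough sketch of the client settings its Kafka-compatible endpoint is generally documented to expect. The helper name event_hubs_config and both arguments are placeholders, and the values should be checked against the current Azure documentation before use:

```python
def event_hubs_config(namespace, connection_string):
    """Build a confluent_kafka-style config dict for Event Hubs (hypothetical helper)."""
    return {
        # The Kafka endpoint of an Event Hubs namespace listens on port 9093.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The username is the literal string "$ConnectionString".
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
    }


if __name__ == "__main__":
    # "my-namespace" and the connection string below are placeholders.
    cfg = event_hubs_config("my-namespace", "Endpoint=sb://...")
    print(cfg["bootstrap.servers"])
```

The resulting dict would be passed to Producer(...) in place of the single bootstrap.servers entry in the question's script.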