
I'm trying to test the reliability of a cluster with three Kafka brokers, three ZooKeeper nodes, and one Kafka Connect worker by disconnecting and reconnecting nodes at random. The problem occurs after a node is reconnected: I get a "Broker may not be available" error because the port of the reconnected node is not reachable. If I restart that node's container, Kafka works correctly again.

The reconnect problem is intermittent: it only happens sometimes, and when it does, the Kafka container stays up but remains disconnected from the network.
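
One way to confirm the "container is up but its port is not reachable" state is a plain TCP probe against each broker. The following is only a minimal sketch: the `port_is_open` and `probe_all` helpers are hypothetical names, and the targets assume the INTERNAL listeners (`kafka-N:9092`) from the compose file, which resolve only from a container attached to the same Docker network.

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False

# Assumed targets: the INTERNAL listeners declared in the compose file.
BROKERS = [("kafka-1", 9092), ("kafka-2", 9092), ("kafka-3", 9092)]

def probe_all(brokers=BROKERS) -> dict:
    """Probe every broker once and map 'host:port' to its reachability."""
    return {f"{host}:{port}": port_is_open(host, port) for host, port in brokers}
```

Calling `probe_all()` right after reconnecting a node, and again after restarting its container, would show whether the listener port really stays closed while the container is reported as running.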

This is my docker-compose file:

version: "2.0"
services:
    postgres:
        container_name: postgres
        ports:
            - '5432:5432'
        environment:
            - POSTGRES_USER=postgres
            - POSTGRES_PASSWORD=postgres
            - POSTGRES_DB=shipment_db
            - PGPASSWORD=password
        image: 'debezium/postgres:13'

    postgres-dest:
        container_name: postgres-dest
        ports:
            - '5433:5432'
        environment:
            - POSTGRES_USER=postgres
            - POSTGRES_PASSWORD=postgres
            - POSTGRES_DB=shipment_db
            - PGPASSWORD=password
        image: 'debezium/postgres:13'

    zookeeper-1:
        container_name: zookeeper-1
        ports:
            - '22181:2181'
        image: 'confluentinc/cp-zookeeper:5.4.6'
        environment:
            ZOOKEEPER_SERVER_ID: 1
            ZOOKEEPER_CLIENT_PORT: 22181
            ZOOKEEPER_TICK_TIME: 2000
            ZOOKEEPER_INIT_LIMIT: 5
            ZOOKEEPER_SYNC_LIMIT: 2
            KAFKA_JMX_PORT: 39999
            JMX_PORT: 39999
            ZOOKEEPER_SERVERS: "0.0.0.0:22888:23888;zookeeper-2:32888:33888;zookeeper-3:42888:43888"
            KAFKA_OPTS: "-Dzookeeper.4lw.commands.whitelist=*"

    zookeeper-2:
        container_name: zookeeper-2
        ports:
            - '32181:2181'
        image: 'confluentinc/cp-zookeeper:5.4.6'
        environment:
            ZOOKEEPER_SERVER_ID: 2
            ZOOKEEPER_CLIENT_PORT: 32181
            ZOOKEEPER_TICK_TIME: 2000
            ZOOKEEPER_INIT_LIMIT: 5
            ZOOKEEPER_SYNC_LIMIT: 2
            KAFKA_JMX_PORT: 39999
            JMX_PORT: 39999
            ZOOKEEPER_SERVERS: "zookeeper-1:22888:23888;0.0.0.0:32888:33888;zookeeper-3:42888:43888"
            KAFKA_OPTS: "-Dzookeeper.4lw.commands.whitelist=*"
        depends_on:
            - zookeeper-1

    zookeeper-3:
        container_name: zookeeper-3
        ports:
            - '42181:2181'
        image: 'confluentinc/cp-zookeeper:5.4.6'
        environment:
            ZOOKEEPER_SERVER_ID: 3
            ZOOKEEPER_CLIENT_PORT: 42181
            ZOOKEEPER_TICK_TIME: 2000
            ZOOKEEPER_INIT_LIMIT: 5
            ZOOKEEPER_SYNC_LIMIT: 2
            KAFKA_JMX_PORT: 39999
            JMX_PORT: 39999
            ZOOKEEPER_SERVERS: "zookeeper-1:22888:23888;zookeeper-2:32888:33888;0.0.0.0:42888:43888"
            KAFKA_OPTS: "-Dzookeeper.4lw.commands.whitelist=*"
        depends_on:
            - zookeeper-2

    kafka-1:
        container_name: kafka-1
        ports:
            - '29092:9092'
        image: confluentinc/cp-kafka:7.0.1 # debezium/kafka:1.8.0.Final #'debezium/kafka:1.7'
        environment:
            KAFKA_BROKER_ID: 1
            KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:22181,zookeeper-2:32181,zookeeper-3:42181
            KAFKA_LISTENERS: INTERNAL://kafka-1:9092,OUTSIDE://0.0.0.0:29092
            KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-1:9092,OUTSIDE://kafka-1:29092
            KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,OUTSIDE:PLAINTEXT
            KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
            KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3   # For group coordinator
                                                        # https://stackoverflow.com/questions/42015158/what-is-the-difference-in-kafka-between-a-consumer-group-coordinator-and-a-consu
            #KAFKA_JMX_PORT: 49999
            #JMX_PORT: 49999
        depends_on:
            - zookeeper-1
            - zookeeper-2
            - zookeeper-3
        restart: always

    kafka-2:
        container_name: kafka-2
        ports:
            - '39092:9092'
        image: confluentinc/cp-kafka:7.0.1 # debezium/kafka:1.8.0.Final #'debezium/kafka:1.7'
        environment:
            KAFKA_BROKER_ID: 2
            KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:22181,zookeeper-2:32181,zookeeper-3:42181
            KAFKA_LISTENERS: INTERNAL://kafka-2:9092,OUTSIDE://0.0.0.0:39092
            KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-2:9092,OUTSIDE://kafka-2:39092
            KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,OUTSIDE:PLAINTEXT
            KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
            KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
            #KAFKA_JMX_PORT: 49999
            #JMX_PORT: 49999
        depends_on:
            - zookeeper-1
            - zookeeper-2
            - zookeeper-3
            - kafka-1
        restart: always

    kafka-3:
        container_name: kafka-3
        ports:
            - '49092:9092'
        image: confluentinc/cp-kafka:7.0.1 # debezium/kafka:1.8.0.Final #'debezium/kafka:1.7'
        environment:
            KAFKA_BROKER_ID: 3
            KAFKA_ZOOKEEPER_CONNECT: zookeeper-1:22181,zookeeper-2:32181,zookeeper-3:42181
            KAFKA_LISTENERS: INTERNAL://kafka-3:9092,OUTSIDE://0.0.0.0:49092
            KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-3:9092,OUTSIDE://kafka-3:49092
            KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,OUTSIDE:PLAINTEXT
            KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
            KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
            #KAFKA_JMX_PORT: 49999
            #JMX_PORT: 49999
        depends_on:
            - zookeeper-1
            - zookeeper-2
            - zookeeper-3
            - kafka-2
        restart: always

    connect:
        image: confluentinc/cp-kafka-connect:7.0.1 #debezium/connect:1.8.0.Final # debezium/connect:1.7
        hostname: connect
        container_name: connect
        ports:
            - 8083:8083
        environment:
            CONNECT_BOOTSTRAP_SERVERS: kafka-1:9092,kafka-2:9092,kafka-3:9092
            CONNECT_GROUP_ID: connect-cluster-A
            CONNECT_CONFIG_STORAGE_TOPIC: my_connect_configs
            CONNECT_OFFSET_STORAGE_TOPIC: my_connect_offsets
            CONNECT_STATUS_STORAGE_TOPIC: my_connect_statuses
            CONNECT_PLUGIN_PATH: /kafka/data, /kafka/connect
            CONNECT_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
            CONNECT_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
            CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
            CONNECT_INTERNAL_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
            CONNECT_REST_ADVERTISED_HOST_NAME: localhost   
            #EXTERNAL_LIBS_DIR: /kafka/external_libs,/kafka/data
            CLASSPATH: /kafka/data/postgresql-42.2.19.jar # a * wildcard can also be used
            #KAFKA_CONNECT_PLUGINS_DIR: /kafka/data, /kafka/connect # old version
            #CONNECT_LOG4J_LOGGERS: "org.apache.kafka.connect=DEBUG,org.apache.plc4x.kafka.Plc4xSinkConnector=DEBUG"
            CONNECT_LOG4J_ROOT_LOGLEVEL: DEBUG
        volumes:
            - type: bind
              source: ./plugins
              target: /kafka/data
        depends_on:
            - zookeeper-1
            - zookeeper-2
            - zookeeper-3
            - kafka-1
            - kafka-2
            - kafka-3
            - postgres
        links:
            - zookeeper-1
            - zookeeper-2
            - zookeeper-3
            - kafka-1
            - kafka-2
            - kafka-3
            - postgres
            - postgres-dest



    ksqldb-server:
        image: confluentinc/ksqldb-server:0.23.1
        hostname: ksqldb-server
        container_name: ksqldb-server
        depends_on:
            - kafka-1
            - kafka-2
            - kafka-3
            - zookeeper-1
            - zookeeper-2
            - zookeeper-3
        ports:
            - "8088:8088"
        volumes:
            - "./confluent-hub-components/:/usr/share/kafka/plugins/"
        environment:
            KSQL_LISTENERS: "http://0.0.0.0:8088"
            KSQL_BOOTSTRAP_SERVERS: "kafka-1:9092,kafka-2:9092,kafka-3:9092"
            KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
            KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
            KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
            KSQL_KSQL_CONNECT_URL: http://connect:8083
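
As a sanity check on the file above, the published port of each broker can be compared against its listener ports. This is a rough sketch with hypothetical helper names (`listener_ports`, `container_port`), using the kafka-1 values copied from the compose file and assuming the simple `host:container` mapping form with no protocol suffix. Whether the result is related to the reconnect problem is not established here.

```python
def listener_ports(listeners: str) -> dict:
    """Map listener name -> port from a KAFKA_LISTENERS-style string."""
    ports = {}
    for entry in listeners.split(","):
        name, _, address = entry.strip().partition("://")
        ports[name] = int(address.rsplit(":", 1)[1])
    return ports

def container_port(mapping: str) -> int:
    """Return the container-side port of a 'host:container' mapping."""
    return int(mapping.rsplit(":", 1)[1])

# Values copied from the kafka-1 service in the compose file.
listeners = "INTERNAL://kafka-1:9092,OUTSIDE://0.0.0.0:29092"
published = "29092:9092"

ports = listener_ports(listeners)
target = container_port(published)
# Which listener does the published host port actually reach?
matching = [name for name, port in ports.items() if port == target]
print(matching)  # ['INTERNAL']
```

For kafka-1, host port 29092 is forwarded to container port 9092, which is the INTERNAL listener, while the OUTSIDE listener on container port 29092 is not published. The same comparison can be repeated for kafka-2, kafka-3, and the ZooKeeper services.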


I'm using the latest version of Docker (20.10.12) and the latest version of Docker Compose (1.29.2).

Do you have any suggestions? Thanks in advance.

Al3
  • What exactly is throwing the error? You don't need more than one Zookeeper here, and 3 brokers on one Docker host doesn't address reliability – OneCricketeer Jan 12 '22 at 17:20
  • This is a first step; the next is to move the nodes to different machines. Do you mean that the problem comes from running everything on a single Docker host? There is no specific error; the problem is that, after the disconnect, the node no longer exposes port 9092. The only solution is to restart the container. – Al3 Jan 12 '22 at 18:30
  • I'm asking which container shows "Broker may not be available". You've got like 10 containers here that all have their own logs, 4 of which try to connect to at least one broker. What exactly is the order of execution between you stopping a broker, seeing an error (in which service?), and then you restarting things? – OneCricketeer Jan 12 '22 at 22:34
  • There is no exact order. Sometimes kafka-1 reports "Broker may not be available", sometimes kafka-2, sometimes kafka-3. For example, I disconnect kafka-1 and kafka-2, and when I reconnect them, broker kafka-3 reports "Broker may not be available". Other times, with different brokers, the test works without problems. There is no specific sequence that reproduces the error, and it does not happen every time. – Al3 Jan 13 '22 at 09:02
  • Another attempt: I disconnected kafka-3 and kafka-1. When I reconnected them, the brokers stayed in an infinite loop, logging "Connection to node 1 (kafka-1/172.18.0.6:9092) could not be established. Broker may not be available." and "Recorded new controller, from now on will use broker kafka-1:9092 (id: 1 rack: null)". kafka-2 (not disconnected) logs "Controller to broker 1-3 connection could not be established. Broker may not be available." – Al3 Jan 13 '22 at 11:57
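
To pin down which container reports the error and which node it cannot reach, the log lines quoted above can be tallied per target node. The sketch below is only an illustration: the regular expression is written against the single message quoted in the last comment, the `unreachable_nodes` helper is a hypothetical name, and real broker logs may carry prefixes or variants it does not cover.

```python
import re
from collections import Counter

# Matches e.g. "Connection to node 1 (kafka-1/172.18.0.6:9092) could not be established"
PATTERN = re.compile(
    r"Connection to node (?P<node>-?\d+) "
    r"\((?P<host>[^/]+)/(?P<ip>[\d.]+):(?P<port>\d+)\) "
    r"could not be established"
)

def unreachable_nodes(lines) -> Counter:
    """Count failed connection attempts per (node id, host, port)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(int(match["node"]), match["host"], int(match["port"]))] += 1
    return counts

# Sample lines taken from the comment above.
sample = [
    "Connection to node 1 (kafka-1/172.18.0.6:9092) could not be established. "
    "Broker may not be available.",
    "Recorded new controller, from now on will use broker kafka-1:9092 (id: 1 rack: null)",
]
print(unreachable_nodes(sample))  # Counter({(1, 'kafka-1', 9092): 1})
```

Running it separately over the saved `docker logs` output of each container would show, per service, which brokers it failed to reach during a given test run.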

0 Answers