0

I have set up NiFi (1.11.4) & Kafka(2.5) via docker (docker-compose file below, actual NiFi flow definition https://github.com/geoHeil/streaming-reference). When trying to follow up on basic getting started tutorials (such as https://towardsdatascience.com/big-data-managing-the-flow-of-data-with-apache-nifi-and-apache-kafka-af674cd8f926) which combine processors such as:

  • generate flowfile (CSV)
  • update attribute
  • PublishKafka2.0

I run into issues of timeoutException:

nifi_1            | 2020-06-10 11:15:47,311 ERROR [kafka-producer-network-thread | producer-2] o.a.n.p.k.pubsub.PublishKafkaRecord_2_0 PublishKafkaRecord_2_0[id=959f0e64-0172-1000-0000-0000650181a4] Failed to send StandardFlowFileRecord[uuid=944464e4-94ea-48dc-89fa-d19c34f163e7,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1591771086044-1, container=default, section=1], offset=227962, length=422],offset=0,name=c8dd1dd2-0ffe-4875-9d45-902ea331c210,size=422] to Kafka: org.apache.kafka.common.errors.TimeoutException: Expiring 64 record(s) for test-0: 30029 ms has passed since batch creation plus linger time
nifi_1            | org.apache.kafka.common.errors.TimeoutException: Expiring 64 record(s) for test-0: 30029 ms has passed since batch creation plus linger time
nifi_1            | 2020-06-10 11:15:47,311 ERROR [kafka-producer-network-thread | producer-2] o.a.n.p.k.pubsub.PublishKafkaRecord_2_0 PublishKafkaRecord_2_0[id=959f0e64-0172-1000-0000-0000650181a4] Failed to send StandardFlowFileRecord[uuid=944464e4-94ea-48dc-89fa-d19c34f163e7,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1591771086044-1, container=default, section=1], offset=227962, length=422],offset=0,name=c8dd1dd2-0ffe-4875-9d45-902ea331c210,size=422] to Kafka: org.apache.kafka.common.errors.TimeoutException: Expiring 64 record(s) for test-0: 30029 ms has passed since batch creation plus linger time
nifi_1            | org.apache.kafka.common.errors.TimeoutException: Expiring 64 record(s) for test-0: 30029 ms has passed since batch creation plus linger time

However, a:

kafkacat -C -b localhost:9092 -t test #starts listener
kafkacat -P -b localhost:9092 -t test #starts producer

pipes events just fine through the kafka instance.

The docker-compose file looks like:

version: "3"
services:
  nifi:
    image: apache/nifi:1.11.4
    ports:
      - 8080:8080 # Unsecured HTTP Web Port
    environment:
      - NIFI_WEB_HTTP_PORT=8080
      - NIFI_CLUSTER_IS_NODE=true
      - NIFI_CLUSTER_NODE_PROTOCOL_PORT=8082
      - NIFI_ZK_CONNECT_STRING=zookeeper:2181
      - NIFI_ELECTION_MAX_WAIT=1 min
    links:
      - broker
      - zookeeper
    volumes:
      - ./for_nifi/conf:/opt/nifi/nifi-current/conf
  zookeeper: #https://github.com/confluentinc/cp-all-in-one/blob/5.5.0-post/cp-all-in-one/docker-compose.yml#L5
    image: confluentinc/cp-zookeeper:5.5.0
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
  broker:
    image: confluentinc/cp-kafka:5.5.0
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "29092:29092"
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
Georg Heiler
  • 16,916
  • 36
  • 162
  • 292

1 Answers1

2

You're using the wrong port to connect to the broker. By connecting to 9092 you connect to the listener that advertises localhost:9092 to the client for subsequent connections. That's why it works when you use kafkacat from your local machine (because 9092 is exposed to your local machine)

If you use broker:29092 then the broker will give the client the correct address for the connection (i.e. broker:29092).

To understand more about advertised listeners see this blog

Robin Moffatt
  • 30,382
  • 3
  • 65
  • 92
  • Great blog post. But there, you also only use 9092, never 29092. However, I can confirm: indeed your solution works just great. – Georg Heiler Jun 10 '20 at 13:00
  • Glad you got it working. In that post the broker may well be configured differently from yours, which is why the port numbers might not match up with what you're using. – Robin Moffatt Jun 10 '20 at 13:37