
I have a high-throughput Kafka producer use case where I want to push thousands of JSON messages every second.

I have a 3-node Kafka cluster, I am using the latest kafka-python library, and I have the following method to produce messages:

from json import dumps
from kafka import KafkaProducer

def publish_to_kafka(topic):
    data = get_data(topic)
    producer = KafkaProducer(bootstrap_servers=['b1', 'b2', 'b3'],
                             value_serializer=lambda x: dumps(x).encode('utf-8'),
                             compression_type='gzip')
    try:
        for obj in data:
            producer.send(topic, value=obj)
    except Exception as e:
        logger.error(e)
    finally:
        producer.close()

My topic has 3 partitions.

The method sometimes works correctly but sometimes fails with the error "KafkaTimeoutError: Failed to update metadata after 60.0 secs."

What settings do I need to change to get it working smoothly?

Sarvesh Kumar
  • Can you share your Kafka broker configuration (`server.properties`)? Also, when you say that it _sometimes_ fails, do you mean using the exact same topic? – Giorgos Myrianthous Jun 04 '20 at 16:26

1 Answer

  1. If the topic does not exist and automatic topic creation is set to false on the brokers, the producer can never fetch metadata for that topic, so the request times out.

    Possible resolution: in the broker configuration (server.properties), set auto.create.topics.enable=true (note that this is the default in Confluent Kafka), or create the topic explicitly before producing, as in the first sketch after this list.

  2. Another cause could be network congestion or slow links: if updating metadata with the Kafka broker takes longer than 60 seconds (the default max.block.ms, which is where the 60.0 secs in the error comes from), the producer gives up.

    Possible resolution: producer configuration: max.block.ms=120000 (120 seconds, for example); see the second sketch after this list.

  3. Check whether your broker(s) are going down for some reason (for example, too much load) and why they are not able to respond to metadata requests. You can typically see this in the server.log file.
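
For case 1, instead of enabling auto-creation you can create the topic up front with kafka-python's admin client. A minimal sketch; the topic name and replication factor here are assumptions (the question only mentions 3 partitions and 3 brokers):

from kafka.admin import KafkaAdminClient, NewTopic
from kafka.errors import TopicAlreadyExistsError

admin = KafkaAdminClient(bootstrap_servers=['b1', 'b2', 'b3'])
try:
    # 3 partitions as in the question; replication_factor=3 is an assumption
    admin.create_topics([NewTopic(name='my_topic', num_partitions=3,
                                  replication_factor=3)])
except TopicAlreadyExistsError:
    pass  # topic already exists, nothing to do
finally:
    admin.close()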
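
For case 2, note that kafka-python spells the setting max_block_ms. A sketch under the question's setup (same brokers and serializer, get_data as defined by the asker) that also reuses one long-lived producer instead of building one per call, since every new producer instance has to fetch metadata from scratch:

from json import dumps
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers=['b1', 'b2', 'b3'],
                         value_serializer=lambda x: dumps(x).encode('utf-8'),
                         compression_type='gzip',
                         max_block_ms=120000)  # wait up to 120 s for metadata

def publish_to_kafka(topic):
    for obj in get_data(topic):
        producer.send(topic, value=obj)
    producer.flush()  # block until all queued messages are delivered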

JavaTechnical