The topic I am consuming data from is distributed across 3 brokers and 18 partitions. Each broker is the leader for 6 partitions. How can I list the leader of each partition from my Python application? (I am using the confluent-kafka library.)
I need to do this because I have 3 different apps, one per broker, and I want each of them to programmatically check which partitions its broker leads and consume data from those partitions.
Currently I have hardcoded the partitions on each broker, but I want to change this so that if a broker goes down, I can start consuming its partitions from the other two brokers.
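To make the goal concrete, here is a rough sketch of the behaviour I am after. It assumes the partition metadata returned by list_topics() exposes the leader's broker id through a leader attribute (my assumption), and MY_BROKER_ID is just a placeholder for the id of the broker each app is paired with:

from confluent_kafka import Consumer, TopicPartition

MY_BROKER_ID = 1  # placeholder: the broker this particular app is paired with
TOPIC = "bets"

consumer = Consumer({'bootstrap.servers': 'my_server', 'group.id': 'my_group'})
metadata = consumer.list_topics(TOPIC, timeout=10)

# Keep only the partitions whose current leader is this app's broker
# and assign them manually instead of hardcoding partition numbers.
mine = [TopicPartition(TOPIC, pmeta.id)
        for pmeta in metadata.topics[TOPIC].partitions.values()
        if pmeta.leader == MY_BROKER_ID]
consumer.assign(mine)

If a broker goes down its partitions get new leaders, so re-running this check would let the two remaining apps pick those partitions up.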
Ideally I would like to extend the following code to include the leader of each partition:
import confluent_kafka
from confluent_kafka import TopicPartition

# `consumer` is an already-configured Consumer instance and `topic` the topic name.
metadata = consumer.list_topics(topic, timeout=10)
if metadata.topics[topic].error is not None:
    raise confluent_kafka.KafkaException(metadata.topics[topic].error)

# Build a TopicPartition for every partition and fetch the committed offsets.
partitions = [TopicPartition(topic, p) for p in metadata.topics[topic].partitions]
committed = consumer.committed(partitions, timeout=10)

for partition in committed:
    # Get the partition's low and high watermark offsets.
    (lo, hi) = consumer.get_watermark_offsets(partition, timeout=10, cached=False)

    if partition.offset == confluent_kafka.OFFSET_INVALID:
        offset = "-"
    else:
        offset = "%d" % (partition.offset)

    if hi < 0:
        lag = "no hwmark"  # Unlikely
    elif partition.offset < 0:
        # No committed offset, show total message count as lag.
        # The actual message count may be lower due to compaction
        # and record deletions.
        lag = "%d" % (hi - lo)
    else:
        lag = "%d" % (hi - partition.offset)

    print("%-50s %9s %9s" % (
        "{} [{}]".format(partition.topic, partition.partition), offset, lag))
EDIT:
Following the code from "How to describe a topic using kafka client in Python", I added the following:
import concurrent.futures
import confluent_kafka
from confluent_kafka.admin import AdminClient, ConfigResource

adminClient = AdminClient({
    'bootstrap.servers': 'my_server'
})
fs = adminClient.describe_configs(
    [ConfigResource(confluent_kafka.admin.RESOURCE_TOPIC, "bets")],
    request_timeout=100)
topic_configResource = adminClient.describe_configs(
    [ConfigResource(confluent_kafka.admin.RESOURCE_TOPIC, "bets")])
for j in concurrent.futures.as_completed(iter(topic_configResource.values())):
    config_response = j.result(timeout=100)
This throws a timeout exception.
Breaking the code down to try to debug it, if I only run
fs = adminClient.describe_configs([ConfigResource(confluent_kafka.admin.RESOURCE_TOPIC, "bets")],
                                  request_timeout=100)
fs
I get the following:
{ConfigResource(ResourceType.TOPIC,bets): <Future at 0x27f882fb100 state=running>}
After the request timeout I have set, I get the following:
{ConfigResource(ResourceType.TOPIC,bets): <Future at 0x27f882fb100 state=finished raised KafkaException>}
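To at least surface the underlying error rather than just a finished future, I suppose the result can be unwrapped like this (a debugging sketch, assuming each describe_configs() future resolves to a dict of config name to ConfigEntry, and reusing fs from above):

for resource, f in fs.items():
    try:
        configs = f.result(timeout=100)
        for name, entry in configs.items():
            print(resource, name, entry.value)
    except confluent_kafka.KafkaException as e:
        # Print the actual error hiding behind "raised KafkaException".
        print("describe_configs failed for {}: {}".format(resource, e))

That said, I suspect describe_configs() only returns topic configuration entries rather than partition leadership, so it may be the wrong tool for this anyway.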
On a side note, Conduktor shows ISR and leader information per broker, so that information is clearly available from the cluster.