We have a requirement to deliberately unbalance a Kafka cluster. For a 3-broker cluster, the traffic should be distributed as 80% to Broker 1, 15% to Broker 2, and 5% to Broker 3, and the messages for the topics should be sent to the brokers according to this distribution.
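As a small worked example of the split described above, the following sketch divides a message count across the three brokers in an 80/15/5 ratio. The name `split_by_weights` is hypothetical and not part of our code. It uses largest-remainder rounding so the shares always add up to the total, which rounding each share independently does not guarantee (for 3 messages, independent rounding gives 2 + 0 + 0).

```python
from typing import List


def split_by_weights(total: int, weights: List[float]) -> List[int]:
    """Split `total` into integer shares proportional to `weights`.

    Hypothetical helper: shares are floored first, then the leftover
    messages go to the entries with the largest fractional parts.
    """
    weight_sum = sum(weights)
    exact = [total * w / weight_sum for w in weights]
    shares = [int(x) for x in exact]  # floor, since all values are non-negative
    leftover = total - sum(shares)
    # Indices ordered by descending fractional part
    order = sorted(range(len(weights)),
                   key=lambda i: exact[i] - shares[i],
                   reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


print(split_by_weights(1000, [80, 15, 5]))
print(split_by_weights(3, [80, 15, 5]))
```

With 1000 messages the shares come out as 800, 150 and 50; with only 3 messages the leftover message goes to the second broker instead of being dropped.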
To implement this logic in Python using kafka-python, we call the produce_unbalanced_message function from within the main function. A sample of the code implementing the logic is provided below:
Main Function
def mf():
    .
    .
    .
    # Create the topics if they don't exist. tps_crtn creates new topics if none
    # are found; otherwise messages are sent to the existing topics as usual.
    tpc_list = tps_crtn(base_topic_name=bt, no_of_topics=int(ntp),
                        topic_partn=int(ptp),
                        repicas_per_partn=int(rpp))
    # Traffic distribution list (percent per broker)
    dl = [80, 15, 5]
    while True:
        for ix, topic in enumerate(tpc_list):
            produce_unbalanced_message(topic_name=topic,
                                       no_of_msgs=int(round(int(nm) * (float(dl[ix]) / 100.0))),
                                       max_wait_time=float(mwt))

if __name__ == "__main__":
    mf()
The main function calls the producer send function below to send messages to every topic in the topic list.
Unbalanced Produce Message Function
from typing import List

from kafka import KafkaAdminClient

def produce_unbalanced_message(topic_name='test-topic',
                               no_of_msgs=-1,
                               max_wait_time=2):
    kafka_admin_client: KafkaAdminClient = KafkaAdminClient(
        bootstrap_servers='10.22.151.16:9100'
    )
    .
    .
    # List of all node ids in the cluster
    LOG.info("Fetch the existing Kafka node list")
    nodeids: List[int] = [node.nodeId for node in kafka_admin_client._client.cluster.brokers()]
    for n in nodeids:
        print(n)
    .
    .
    .
    # Send unbalanced messages to Kafka
    producer.send(topic_name,
                  key=key,
                  value=message)
    .
    .
As per the requirement, messages should be sent according to the broker IDs and the corresponding traffic distribution list, not according to the topic list. We get the broker IDs from the nodeids list in the produce_unbalanced_message function.
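To make the intended pairing concrete, here is a minimal sketch that ties each broker ID to its percentage. The helper name `weights_by_broker` is hypothetical, and it assumes the distribution list is given in ascending broker-ID order; the IDs are sorted first because the broker collection returned by the cluster metadata has no guaranteed order.

```python
from typing import Dict, Iterable, List


def weights_by_broker(node_ids: Iterable[int], weights: List[int]) -> Dict[int, int]:
    """Pair each broker id with its traffic percentage.

    Assumption (not from the original code): `weights` is listed in
    ascending broker-id order, one entry per broker.
    """
    ids = sorted(node_ids)
    if len(ids) != len(weights):
        raise ValueError(
            f"{len(ids)} brokers but {len(weights)} weights; need one weight per broker"
        )
    return dict(zip(ids, weights))


# Example with made-up broker ids 1, 2 and 3
print(weights_by_broker({3, 1, 2}, [80, 15, 5]))
```

The length check fails early with a clear message if the distribution list and the broker count ever drift apart.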
However, when testing this code with more than three topics, we get an index out of range error. The reason is that the traffic distribution list has one entry per broker (three entries), but the main function indexes it by topic position, so once the topic count exceeds three, dl[ix] has no matching entry.
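One possible way around the index error, sketched under assumptions, is to loop over the brokers inside the loop over topics, so the distribution is only ever looked up by broker and the topic count no longer matters. The function `plan_sends` and the topic names are hypothetical; the per-share rounding mirrors the `int(round(...))` used in the main function.

```python
from typing import Dict, List, Tuple


def plan_sends(topics: List[str],
               broker_weights: Dict[int, int],
               msgs_per_topic: int) -> List[Tuple[str, int, int]]:
    """Return (topic, broker_id, message_count) triples.

    The weights are looked up by broker id, never by topic index,
    so any number of topics is accepted.
    """
    plan = []
    for topic in topics:
        for broker_id, pct in broker_weights.items():
            count = int(round(msgs_per_topic * pct / 100.0))
            plan.append((topic, broker_id, count))
    return plan


# Five made-up topics against three brokers: no IndexError
topics = [f"topic-{i}" for i in range(5)]
plan = plan_sends(topics, {1: 80, 2: 15, 3: 5}, 100)
for row in plan[:3]:
    print(row)
```

This only produces a plan. Since a producer writes to the leader of a partition, each (topic, broker) pair would still have to be mapped to a partition that the broker leads; `producer.send` in kafka-python accepts an optional `partition` argument that could be used for that, though how the leader is looked up is left open here.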
Can anyone please suggest what changes to make so that messages are sent according to the broker IDs obtained from the nodeids list and the corresponding traffic distribution list, rather than according to the topic list?