I am using Confluentinc Kafka with Python and multi-threading. I have N worker threads running in parallel, and whenever a thread completes its work, it polls the next message from Kafka on demand. All of this is driven by a while loop, which blocks my main thread so that no other operation can be performed.

Below is a sample of my code:

import concurrent.futures
import time

# kafka_consumer and message_thread_executor are defined elsewhere.
futures = []


def get_poll_message(avail_slots):
    # Fetch at most `avail_slots` records and flatten the per-partition batches.
    raw_messages = kafka_consumer.poll(max_records=avail_slots)
    msgs = []
    for topic_partition, message in raw_messages.items():
        for msg in message:
            msgs.append(msg)

    return msgs


with concurrent.futures.ThreadPoolExecutor(5) as executor:
    while True:
        # Count the workers that are currently busy.
        counter = 0
        for future in futures:
            is_running = future.running()
            if is_running:
                counter += 1

        avail_slots = 5 - counter
        if avail_slots > 0:
            # Poll only as many messages as there are free workers.
            for message in get_poll_message(avail_slots):
                future = executor.submit(
                    message_thread_executor, message=message
                )
                futures.append(future)
        else:
            # All workers are busy; wait before checking again.
            time.sleep(10)

Is there any other way to do this in Python instead of using the while loop? I want to remove the while loop so that my main thread does not get blocked.

OneCricketeer

1 Answer

You could use the supervisor Python library to run 5 processes in parallel, each with one consumer. That would simplify your code and give you better process management.

Otherwise, your while loop should live in the thread body, with a callback for the records it has polled, rather than in the main loop, where it iterates over each future and passes only one message at a time to an executor.
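
As a rough sketch of that second suggestion (one possible reading, not a definitive implementation): the polling loop runs in its own background thread, a semaphore tracks free worker slots, and each polled record is handed to a callback on the pool. FakeConsumer, poll_loop, handle and run_demo are hypothetical names made up for illustration. FakeConsumer only imitates the poll(max_records=...) shape used in the question so the example runs without a broker, and a real consumer would take its place.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeConsumer:
    """Hypothetical stand-in for a Kafka consumer (assumption, not a real client).

    It only mimics the poll(timeout_ms=..., max_records=...) call from the
    question, returning {partition: [records]}.
    """

    def __init__(self, records):
        self._records = list(records)

    def poll(self, timeout_ms=0, max_records=1):
        batch = self._records[:max_records]
        self._records = self._records[max_records:]
        return {"demo-topic-0": batch} if batch else {}


def poll_loop(consumer, executor, handle, stop_event, max_workers=5):
    """Body of the polling thread: fetch a record only when a worker is free."""
    slots = threading.Semaphore(max_workers)
    while not stop_event.is_set():
        # Wait briefly for a free worker slot, then re-check the stop flag.
        if not slots.acquire(timeout=0.1):
            continue
        batches = consumer.poll(timeout_ms=100, max_records=1)
        records = [r for batch in batches.values() for r in batch]
        if not records:
            slots.release()
            stop_event.wait(0.05)  # nothing to do; back off a little
            continue
        future = executor.submit(handle, records[0])
        # Free the slot as soon as the worker finishes this record.
        future.add_done_callback(lambda _f: slots.release())


def run_demo(n_messages=12):
    processed = []
    lock = threading.Lock()
    all_done = threading.Event()

    def handle(record):
        # Callback executed on a worker thread for each polled record.
        with lock:
            processed.append(record)
            if len(processed) == n_messages:
                all_done.set()

    consumer = FakeConsumer(range(n_messages))
    stop_event = threading.Event()
    with ThreadPoolExecutor(max_workers=5) as executor:
        poller = threading.Thread(
            target=poll_loop,
            args=(consumer, executor, handle, stop_event),
            daemon=True,
        )
        poller.start()
        # The main thread is not blocked by the loop and could do other work
        # here; this demo simply waits until every record has been handled.
        all_done.wait(timeout=5)
        stop_event.set()
        poller.join()
    return sorted(processed)


if __name__ == "__main__":
    print(run_demo())
```

Only the polling thread ever touches the consumer here, which matters because Kafka consumer objects are generally not safe to share across threads. The main thread only starts the poller and later sets stop_event to shut it down.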

OneCricketeer