
Following is my use-case scenario: one application pushes data to three different Kafka topics (each request carries a unique app_id), and the output then flows to the subsequent queue 4 and queue 5. I have already implemented the pipeline shown below.

The only problem I am facing is how to combine all the output for a particular app_id from topic 5. The application pushes multiple requests, each with a unique id, into this pipeline, so the requests for a particular app_id may not arrive in sequence, and there may be data for other app_ids in queue 5 as well.

Should I use a different group_id for each app_id when creating the consumer for topic 5?

Please help if you have any ideas. I am using kafka-python.

from kafka import KafkaConsumer, KafkaProducer

KAFKA = dict()

# Single producer shared by the whole pipeline
KAFKA['producer'] = KafkaProducer(bootstrap_servers=[server])

# One consumer per queue, each in its own consumer group
for queue in ['queue 1', 'queue 2', 'queue 3', 'queue 4', 'queue 5']:
    KAFKA[queue] = KafkaConsumer(queue,
                                 bootstrap_servers=[server],
                                 auto_offset_reset='earliest',
                                 enable_auto_commit=True,
                                 auto_commit_interval_ms=1000,
                                 group_id='group' + queue)

[pipeline diagram]


1 Answer


If you just want to read three topics at once, then you'd pass them all to a single consumer, e.g. KafkaConsumer('topic1', 'topic2', 'topic3')
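
For example, here is a minimal sketch of one consumer reading several topics and collecting records per app_id; the topic names, broker address, group_id, and the assumption that each message value is JSON containing an app_id field are placeholders based on the question, not anything your pipeline necessarily uses:

import json
from collections import defaultdict
from kafka import KafkaConsumer

server = 'localhost:9092'  # placeholder broker address, standing in for the question's `server`

# One consumer subscribed to several topics at once; records are grouped
# per app_id in the application, regardless of arrival order or source topic.
consumer = KafkaConsumer('queue_4', 'queue_5',
                         bootstrap_servers=[server],
                         auto_offset_reset='earliest',
                         value_deserializer=lambda v: json.loads(v.decode('utf-8')),
                         group_id='combined-output')

results = defaultdict(list)
for message in consumer:
    record = message.value
    results[record['app_id']].append(record)  # collect all output for each app_id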

I would also recommend Faust if the goal is to build multiple topic chains like this.
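
As a rough sketch of what one stage of such a chain can look like in Faust (the app name, broker address, and topic names are placeholders; by default Faust deserializes values as JSON):

import faust

# A two-topic chain: consume from one topic, do some work, produce to the next.
app = faust.App('pipeline-demo', broker='kafka://localhost:9092')
source = app.topic('queue_4')
sink = app.topic('queue_5')

@app.agent(source)
async def forward(stream):
    async for event in stream:
        # ... transform the event here ...
        await sink.send(value=event)

if __name__ == '__main__':
    app.main()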
