I am trying to run an Apache Beam pipeline that reads from one Kafka topic and writes to another Kafka topic. The code runs without any errors (as far as I can tell), but I cannot see any data in my output topic. This is the code:
import apache_beam as beam
import apache_beam.transforms.window as window
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.io.external.kafka import ReadFromKafka, WriteToKafka

def run_pipeline():
    with beam.Pipeline(options=PipelineOptions()) as p:
        (p
         | 'Read from Kafka' >> ReadFromKafka(consumer_config={'bootstrap.servers': 'localhost:9092',
                                                               'auto.offset.reset': 'latest'},
                                              topics=['demo'])
         | 'Window of 10 seconds' >> beam.WindowInto(window.FixedWindows(10))
         # | 'Group by key' >> beam.GroupByKey()
         | 'Write to Kafka' >> WriteToKafka(producer_config={'bootstrap.servers': 'localhost:9092'},
                                            topic='demo_output'))
        # | 'Write to console' >> beam.Map(print)
        # | 'Write to text' >> beam.io.WriteToText('outputfile.txt')

if __name__ == '__main__':
    run_pipeline()
This is the blog post I am trying to follow:
https://maximilianmichels.com/2020/getting-started-with-beam-python/
I used the Kafka console producer to write keyed messages to my source topic:
$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic demo --property "parse.key=true" --property "key.separator=:"
However, I still cannot see any messages arriving in my destination topic when I try to consume them:
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic demo_output
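
For reference, here is a minimal sketch of what I understand the producer options above (`parse.key=true`, `key.separator=:`) to do with each line typed into the console: the line is split at the first separator into a key and a value. The helper name `split_console_line` is hypothetical and is not part of Kafka or Beam. The UTF-8 encoding and the error raised for a line without a separator are my assumptions about the behaviour, not something taken from the blog post.

```python
from typing import Tuple


def split_console_line(line: str, separator: str = ":") -> Tuple[bytes, bytes]:
    """Hypothetical helper: split one console line into (key, value) bytes.

    Assumption: the split happens at the FIRST occurrence of the separator,
    and a line without a separator is rejected.
    """
    key, found, value = line.partition(separator)
    if not found:
        # Assumed behaviour: a keyed producer cannot send a line with no key.
        raise ValueError(f"no key separator {separator!r} in line: {line!r}")
    # Assumption: both parts are sent as UTF-8 encoded bytes.
    return key.encode("utf-8"), value.encode("utf-8")


if __name__ == "__main__":
    # Example lines as they might be typed into the console producer.
    for typed in ["user1:hello", "user2:a:b"]:
        print(split_console_line(typed))
```

So a line such as `user1:hello` should reach the `demo` topic as a key/value pair, which is the shape I expect the pipeline to read and then write to `demo_output`.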