I am trying to stream messages from a Kafka consumer into 30-second windows using Apache Beam. I used beam_nuggets.io to read from the Kafka topic.

You can see my code below:

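import apache_beam as beam
from apache_beam import window
from apache_beam.options.pipeline_options import PipelineOptions
from beam_nuggets.io import kafkaio

with beam.Pipeline(options=PipelineOptions()) as p:
    consumer_message = (p | "Reading messages from Kafka" >> kafkaio.KafkaConsume(consumer_config=consumer_config)
                        | 'window' >> beam.WindowInto(window.FixedWindows(30))
                        | 'groupBy' >> beam.GroupByKey()
                        | beam.Map(print))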

GroupByKey still produces no output.

A sample element from consumer_message:

(None, '{"userId": null, "visitorId": "1cb8b48d-6495-44fc-9ba5-ba28d71933a7", "ip": "10.212.134.89", "userAgent": "Mozilla/5.0 (iPhone; CPU iPhone OS 10_3_1 like Mac OS X) AppleWebKit/603.1.30 (KHTML, like Gecko) Version/10.0 Mobile/14E304 Safari/602.1", "referer": "https://test.xxx.com/", "clientName": "xxx.com", "clientTypeId": "0", "sequenceAtSession": "1", "sessionId": "8f098d91-9049-49d0-ae52-63dffda76936", "url": null, "dimension": null, "event": {"category": null, "action": "pageview", "label": null, }, "startDate": "2021-10-18T07:05:46.9244107+00:00", "endDate": "", "pageType": "homePage", "countryCode": "ZZ", "isp": "Private network", "usageType": "reserved", "organization": "Rfc 1918"}')

I assume GroupByKey() should still be able to group these, since the key for all my messages is None. Please correct me if I'm wrong. Thanks.

Olaf Kock

1 Answer

It looks like the trigger never fires. Since you are implicitly using the default trigger, a window's results are only emitted once the watermark passes the end of that window.

This might be the result of the watermark not advancing. Did you try sending new events after the end of the window?
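To make the watermark point concrete, here is a toy model in plain Python. It is not Beam code and not how Beam is implemented; `window_for` and `ToyGroupByKey` are hypothetical names made up for this sketch, and timestamps are simple integers in seconds. It only illustrates the idea that grouped output is held back until the watermark moves past the end of the 30-second window, even when every element has the key None.

```python
# Toy model of event-time fixed windows with "fire when the watermark
# passes the end of the window". Plain Python, not Beam; all names here
# are illustrative assumptions.
from collections import defaultdict

WINDOW_SIZE = 30  # seconds, mirroring FixedWindows(30)


def window_for(event_time, size=WINDOW_SIZE):
    """Return the [start, end) fixed window that contains event_time."""
    start = event_time - (event_time % size)
    return (start, start + size)


class ToyGroupByKey:
    """Buffers values per window and key until the watermark passes the window end."""

    def __init__(self):
        self.buffers = defaultdict(lambda: defaultdict(list))
        self.watermark = float("-inf")

    def add(self, event_time, key, value):
        self.buffers[window_for(event_time)][key].append(value)

    def advance_watermark(self, new_watermark):
        """Move the watermark forward (never backward) and return what fires."""
        self.watermark = max(self.watermark, new_watermark)
        fired = []
        for win in sorted(self.buffers):  # sorted() copies the keys, so pop is safe
            if win[1] <= self.watermark:
                groups = self.buffers.pop(win)
                fired.extend((win, key, values) for key, values in groups.items())
        return fired


gbk = ToyGroupByKey()
gbk.add(5, None, "msg-a")
gbk.add(12, None, "msg-b")
print(gbk.advance_watermark(20))  # [] : watermark is still inside [0, 30)
print(gbk.advance_watermark(31))  # [((0, 30), None, ['msg-a', 'msg-b'])]
```

In this model the None key is not a problem: both messages land in the same group. Nothing is printed until something moves the watermark beyond 30, which is why a source that never advances the watermark would leave GroupByKey silent.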

Deniz Acay