
I am trying to tune the windowing parameters in my streaming Beam pipeline. The parameters I am modifying include withAllowedLateness, triggers, the window interval, pane firing, etc. However, I don't know how to produce late data in my Kafka-consuming pipeline to test the changes. Can anybody suggest how to create late events?

Thanks

Fabio

1 Answer


Do you use the Kafka publish time as the event time for windowing, or a custom field? Most of the time we window on a custom date field. In most cases that makes more sense, since you want to group by some logical time, and a publisher that has issues may also publish messages with some delay. With a custom field it is very easy to simulate "late data": just send events whose custom date field contains a past date-time.
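As a rough sketch of that idea, the snippet below builds JSON test events where a configurable share carry a past timestamp. The field names `event_time` and `simulated_late` and the helper `make_events` are made up for illustration; use whatever field your pipeline actually windows on. Publishing the payloads to Kafka is left out and would go through whichever producer client you already use.

```python
import json
import random
from datetime import datetime, timedelta, timezone


def make_events(n, late_fraction=0.2, max_delay=timedelta(minutes=30), now=None, seed=None):
    """Build n JSON-encoded events; roughly late_fraction of them get a past event_time."""
    rng = random.Random(seed)
    now = now or datetime.now(timezone.utc)
    events = []
    for i in range(n):
        is_late = rng.random() < late_fraction
        if is_late:
            # Push the event time between 1 minute and max_delay into the past.
            delay = timedelta(seconds=rng.uniform(60, max_delay.total_seconds()))
        else:
            delay = timedelta(0)
        event = {
            "id": i,
            "event_time": (now - delay).isoformat(),  # hypothetical custom date field
            "simulated_late": is_late,  # marker so the test output is easy to inspect
        }
        events.append(json.dumps(event))
    return events
```

Choosing `max_delay` larger than the window size plus the allowed lateness should give you a mix of events that are accepted late and events that are dropped, which is useful for checking both paths.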

Do you rely on message ordering when consuming the data? If so, you can keep publishing data to your Kafka topic without reading it at all, then start the Beam job once a large backlog has built up. When there is a backlog, messages are usually not read in order, so more data arrives after its window has closed, which makes it late data.
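To reason about which of those out-of-order events will count as late, here is a deliberately simplified model of fixed windows, not Beam's actual implementation. The function `classify` and its parameters are hypothetical, all values are in seconds, and the watermark is passed in by hand rather than estimated by the runner.

```python
def classify(event_ts, watermark, window_size, allowed_lateness):
    """Return 'on_time', 'late', or 'dropped' for one event (all values in seconds)."""
    # End of the fixed window that contains this event time.
    window_end = event_ts - (event_ts % window_size) + window_size
    if watermark < window_end:
        return "on_time"  # the window is still open
    if watermark < window_end + allowed_lateness:
        return "late"  # window closed, but still within allowed lateness
    return "dropped"  # arrived after allowed lateness expired
```

For example, with 60-second windows and 30 seconds of allowed lateness, an event with timestamp 65 belongs to the window ending at 120, so it is on time while the watermark is below 120, late between 120 and 150, and dropped from 150 onward.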

Brachi