
watermark

Just curious: if the data source doesn't emit data for a while, does the Flink operator still receive watermarks? Will the Low Watermark in the image keep advancing, or will it stay still?

Grant

1 Answer


Whether the watermark will continue to advance when the source is completely idle depends on the WatermarkStrategy. Neither of the built-in strategies (i.e., BoundedOutOfOrderness or MonotonousTimestamps) will advance the watermark under these circumstances, but some folks use custom strategies that detect idleness and advance the watermark based on the passage of wall-clock time.
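To make the "custom strategy" idea concrete, here is a minimal plain-Java sketch of the logic such a generator might use. It is not Flink code: the class `IdleAwareWatermarkSketch`, its method names, and the two timeouts are all hypothetical, and wall-clock time is passed in explicitly so the behaviour is easy to follow. In a real job this logic would live inside your own watermark generator.

```java
import java.time.Duration;

// Hypothetical helper, not part of Flink: models a periodic watermark generator
// that falls back to wall-clock time once the source has been silent too long.
final class IdleAwareWatermarkSketch {
    private final long maxOutOfOrdernessMs;
    private final long idleTimeoutMs;
    private long maxTimestampMs = Long.MIN_VALUE;       // highest event time seen
    private long lastEventWallClockMs = Long.MIN_VALUE; // when the last event arrived
    private long lastEmittedMs = Long.MIN_VALUE;        // keeps the watermark monotonic

    IdleAwareWatermarkSketch(Duration maxOutOfOrderness, Duration idleTimeout) {
        this.maxOutOfOrdernessMs = maxOutOfOrderness.toMillis();
        this.idleTimeoutMs = idleTimeout.toMillis();
    }

    // Called for every event; wallClockMs is the processing time of arrival.
    void onEvent(long eventTimestampMs, long wallClockMs) {
        maxTimestampMs = Math.max(maxTimestampMs, eventTimestampMs);
        lastEventWallClockMs = wallClockMs;
    }

    // Called periodically; returns the watermark to emit at wallClockMs.
    long currentWatermark(long wallClockMs) {
        if (maxTimestampMs == Long.MIN_VALUE) {
            return Long.MIN_VALUE; // no events yet, nothing to base a watermark on
        }
        long candidate = maxTimestampMs - maxOutOfOrdernessMs;
        long silentForMs = wallClockMs - lastEventWallClockMs;
        if (silentForMs > idleTimeoutMs) {
            // Source looks idle: advance by the wall-clock time beyond the timeout.
            candidate += silentForMs - idleTimeoutMs;
        }
        lastEmittedMs = Math.max(lastEmittedMs, candidate); // never go backwards
        return lastEmittedMs;
    }
}
```

The trade-off is that any event arriving after the watermark has been pushed forward this way may be treated as late, so the idle timeout needs to be chosen with the source's real gaps in mind.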

If some of the source partitions/splits/shards are idle and others are not, the watermark can fail to advance, because the overall watermark is the minimum of the per-partition watermarks and an idle partition holds it back. Some sources support a watermark strategy that implements a withIdleness option that overcomes this problem [1].

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/event_timestamps_watermarks.html#dealing-with-idle-sources
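In Flink itself, the option is set on the strategy, e.g. `WatermarkStrategy.forBoundedOutOfOrderness(Duration.ofSeconds(5)).withIdleness(Duration.ofMinutes(1))` (the durations here are arbitrary). The plain-Java sketch below is not Flink code; it only models why this helps, assuming the combined watermark is the minimum over partitions that are not marked idle. The class `PartitionWatermarkSketch` and its members are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not part of Flink: combines per-partition watermarks.
final class PartitionWatermarkSketch {
    static final class Partition {
        final long watermarkMs;
        final boolean idle;

        Partition(long watermarkMs, boolean idle) {
            this.watermarkMs = watermarkMs;
            this.idle = idle;
        }
    }

    // Combined watermark = minimum over the partitions not marked idle.
    // Returns Long.MIN_VALUE when every partition is idle (nothing to emit).
    static long combine(List<Partition> partitions) {
        long min = Long.MAX_VALUE;
        boolean anyActive = false;
        for (Partition p : partitions) {
            if (p.idle) {
                continue; // idle partitions are ignored instead of holding the minimum back
            }
            anyActive = true;
            min = Math.min(min, p.watermarkMs);
        }
        return anyActive ? min : Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        Partition busy = new Partition(1_000L, false);
        Partition silent = new Partition(Long.MIN_VALUE, false);
        Partition silentMarkedIdle = new Partition(Long.MIN_VALUE, true);
        System.out.println(combine(Arrays.asList(busy, silent)));
        System.out.println(combine(Arrays.asList(busy, silentMarkedIdle)));
    }
}
```

With the silent partition counted as active, the combined watermark stays stuck at that partition's value; once it is marked idle, the busy partition's watermark is used instead.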

David Anderson
  • Does the withIdleness strategy work on connected streams? I am facing a similar issue as reported in https://issues.apache.org/jira/browse/FLINK-18934. Is there any workaround for detecting idleness in a connected stream? – Sairam Sankaran Feb 24 '21 at 18:40
  • If FLINK-18934 is preventing withIdleness from working for you, you can work around this with a hack, by using a rebalance before watermarking so that the idle stream is blended with non-idle streams, or you can use something like https://github.com/aljoscha/flink/blob/6e4419e550caa0e5b162bc0d2ccc43f6b0b3860f/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/timestamps/ProcessingTimeTrailingBoundedOutOfOrdernessTimestampExtractor.java to handle the idleness yourself. – David Anderson Feb 24 '21 at 19:26
  • @SairamSankaran did you find any workaround? FLINK-18934 is fixed and was released in version 1.14.0, but it still doesn't work for me. – shbunjaku Jan 14 '22 at 10:26