I am confused about whether a TUMBLE window is evaluated at a regular interval and emits its elements for processing at that point. For example, I have a query that is expected to work on a 10-second interval.

select id, key from  eventTable  GROUP BY TUMBLE(rowTime, INTERVAL '10' SECOND), id, key ;

Now let's say the application receives these events:

  • E1 @10:00:00
  • E2 @10:00:05
  • E3 @12:00:10

As you can see, E1 and E2 arrive within 5 seconds of each other, while E3 arrives roughly two hours later.

  • Can you please help me understand when E1 and E2 will be emitted for processing? Will it be at 10:00:11, or only once E3 arrives and the query evaluates the window and emits?
  • If it is only after E3, is there any way to make sure the query is executed every 10 seconds?

Appreciate your help on this.

Ashutosh

2 Answers

If you are using event time processing, then the window that ends at 10:00:10 will be emitted when the watermark passes 10:00:10. If the watermarking is done in the usual bounded-out-of-orderness fashion, and if there are no other events, then the watermark won't advance until E3 is processed.
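To make the timing concrete, here is a minimal plain-Java simulation of the scenario in the question. It does not use Flink at all; the class and method names are hypothetical, and the out-of-orderness bound of 0 ms is an assumption, since the question does not state one. It models a bounded-out-of-orderness watermark as the highest timestamp seen, minus the bound, minus 1 ms, and treats a window as ready to fire once the watermark reaches the window's last millisecond.

```java
/** Plain-Java simulation (no Flink dependency) of a 10-second event-time tumbling window. */
class TumbleWatermarkSketch {
    static final long WINDOW_MS = 10_000L;
    // Assumed out-of-orderness bound; not given in the question.
    static final long MAX_OUT_OF_ORDER_MS = 0L;

    /** Milliseconds since midnight for a wall-clock time. */
    static long ms(int hours, int minutes, int seconds) {
        return ((hours * 60L + minutes) * 60L + seconds) * 1000L;
    }

    /** Start of the tumbling window that contains the given timestamp. */
    static long windowStart(long timestamp) {
        return timestamp - (timestamp % WINDOW_MS);
    }

    /** Watermark implied by the highest event timestamp seen so far. */
    static long watermark(long maxTimestampSeen) {
        return maxTimestampSeen - MAX_OUT_OF_ORDER_MS - 1;
    }

    /** A window [start, end) can fire once the watermark reaches end - 1. */
    static boolean fires(long windowEnd, long watermark) {
        return watermark >= windowEnd - 1;
    }

    public static void main(String[] args) {
        long e2 = ms(10, 0, 5);
        long e3 = ms(12, 0, 10);
        long firstWindowEnd = windowStart(e2) + WINDOW_MS; // 10:00:10

        System.out.println("fires after E2: " + fires(firstWindowEnd, watermark(e2)));
        System.out.println("fires after E3: " + fires(firstWindowEnd, watermark(e3)));
    }
}
```

In this model the window holding E1 and E2 is not emitted after E2, because the watermark is still at 10:00:04.999, and it is emitted only once E3 pushes the watermark past 10:00:10.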

If you require a watermarking strategy that takes idleness into account, I believe your only option is to use the DataStream API to create the stream and apply watermarking that deals with idle sources, and then convert the DataStream to a Table.

Note that what .withIdleness(...) does is to mark a stream as idle, which keeps that stream from holding back the watermark. This solves the problem of one idle stream holding back the entire job if there are other, active streams. If you want the watermark to progress when absolutely nothing is happening, you'll need to do something more drastic.
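As a rough illustration of why idleness marking helps only when some other stream is still active, here is a simplified plain-Java model of how a downstream operator might combine the watermarks of its inputs. This is not Flink code: the class and method names are hypothetical, and Flink's actual handling has more detail than this sketch. The model takes the minimum watermark over the inputs that are not marked idle and reports no result when every input is idle.

```java
import java.util.OptionalLong;

/** Simplified model (not Flink code) of combining per-input watermarks with idleness. */
class IdleAwareWatermarkSketch {

    /**
     * Returns the minimum watermark over all non-idle inputs,
     * or empty if every input is idle and there is nothing to advance to.
     */
    static OptionalLong combine(long[] watermarks, boolean[] idle) {
        long min = Long.MAX_VALUE;
        boolean anyActive = false;
        for (int i = 0; i < watermarks.length; i++) {
            if (idle[i]) {
                continue; // an idle input no longer holds the watermark back
            }
            anyActive = true;
            min = Math.min(min, watermarks[i]);
        }
        return anyActive ? OptionalLong.of(min) : OptionalLong.empty();
    }
}
```

With two inputs at watermarks 5000 and 90000, the combined watermark is 5000 while both are active and 90000 once the first is marked idle. With a single source that has gone idle, as in the question, there is no active input left to drive the watermark forward.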

The ideal solution is to have keepalive messages that originate from the same source, so that you know that the idleness is genuine, rather than an outage. Failing that, see ProcessingTimeTrailingBoundedOutOfOrdernessTimestampExtractor for an example of how to use a timer to detect idleness and advance the watermark based on the passage of time, rather than the arrival of new events. (Note that this example has not been updated to use the new WatermarkStrategy interface.)
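The timer-based idea can be sketched in plain Java as follows. This is a hypothetical, simplified scheme and not the logic of the class named above; all names and parameters here are assumptions made for illustration. It assumes event time and wall-clock time advance at roughly the same rate: once no event has arrived for a configured timeout, the watermark is pushed forward by the length of the quiet period.

```java
/** Plain-Java sketch (no Flink dependency) of a watermark that keeps moving while the source is idle. */
class IdleAdvancingWatermarkSketch {
    private final long maxOutOfOrderMs; // assumed out-of-orderness bound
    private final long idleTimeoutMs;   // quiet period after which the source counts as idle
    private long maxEventTs = Long.MIN_VALUE;
    private long lastEventWallClockMs;
    private long lastEmitted = Long.MIN_VALUE;

    IdleAdvancingWatermarkSketch(long maxOutOfOrderMs, long idleTimeoutMs) {
        this.maxOutOfOrderMs = maxOutOfOrderMs;
        this.idleTimeoutMs = idleTimeoutMs;
    }

    /** Called for every event, with its event timestamp and the current wall-clock time. */
    void onEvent(long eventTs, long wallClockMs) {
        maxEventTs = Math.max(maxEventTs, eventTs);
        lastEventWallClockMs = wallClockMs;
    }

    /** Called periodically by a timer; returns the watermark to emit. */
    long onPeriodicEmit(long wallClockMs) {
        if (maxEventTs == Long.MIN_VALUE) {
            return lastEmitted; // no events yet, nothing to base a watermark on
        }
        long candidate = maxEventTs - maxOutOfOrderMs - 1;
        long quietMs = wallClockMs - lastEventWallClockMs;
        if (quietMs >= idleTimeoutMs) {
            candidate += quietMs; // advance by elapsed wall-clock time while idle
        }
        lastEmitted = Math.max(lastEmitted, candidate); // watermarks never go backwards
        return lastEmitted;
    }
}
```

Applied to the question, with a bound of 0 ms and a 5-second timeout, the watermark would pass 10:00:10 a few seconds after E2 without waiting for E3. The trade-off is that any event arriving later with an older timestamp is treated as late, which is why keepalive messages from the source are the safer option.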

David Anderson
  • Thanks David. I tried to implement WatermarkStrategy wmStrategy = WatermarkStrategy.forBoundedOutOfOrderness(Duration.ofMillis(10)).withTimestampAssigner((eventResponse, l) -> eventResponse.getRowTime().getTime()).withIdleness(Duration.ofMinutes(1)); but it is not working: the last couple of events never trigger a result, because the source is idle and the watermark does not advance. – Ashutosh Aug 09 '20 at 12:01
  • I've expanded my answer to explain what I think is going on in your case. – David Anderson Aug 09 '20 at 12:20
You can configure the table environment so that the window emits results early:

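 // bbTableEnv is the existing table environment
 TableConfig config = bbTableEnv.getConfig();
 // Enable early firing and emit partial window results every second
 config.getConfiguration().setBoolean("table.exec.emit.early-fire.enabled", true);
 config.getConfiguration().setString("table.exec.emit.early-fire.delay", "1s");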
Hayden Zhou