
I have an event-time streaming application that uses the CEP library for a basic three-step pattern on a joined stream. The joined stream combines live data, which is watermarked and windowed, with a stream of historical items that sits outside of the windowing/watermarking.

The setup is similar to the one in the dataArtisans blog post, except with the CEP pattern as the last step.
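Roughly, the stream setup looks like the following sketch (the source names, the timestamp field, and the use of union are assumptions; the point is that only the live stream is assigned timestamps and watermarks):

// Live stream: assigned timestamps and watermarks (windowing of this stream omitted).
DataStream<AlertEvent> liveStream = env
        .addSource(liveSource)
        .assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<AlertEvent>(Time.seconds(10)) {
                    @Override
                    public long extractTimestamp(AlertEvent event) {
                        return event.timestamp; // assumed event-time field
                    }
                });

// Historical stream: consumed as-is, with no timestamps or watermarks assigned.
DataStream<AlertEvent> historicalStream = env.addSource(historicalSource);

// The combined stream that feeds the CEP pattern below.
DataStream<AlertEvent> alertEventStream = liveStream.union(historicalStream);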

Our CEP setup looks like this, and it worked before we added the non-timestamped historical stream. The EscalatingAlertEventIterativeCondition compares each event's level against the event matched by the named previous step, ensuring the levels are escalating.

Pattern<AlertEvent, ?> pattern = Pattern.<AlertEvent>begin("one")
        .where((AlertEvent event) -> event.level > 0)
        .next("two")
        .where(new EscalatingAlertEventIterativeCondition("one"))
        .next("three")
        .where(new EscalatingAlertEventIterativeCondition("two"));

return CEP.pattern(alertEventStream, pattern);
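The implementation of the condition isn't shown above; a minimal sketch of what it might look like, assuming AlertEvent exposes a numeric level field and the constructor takes the name of the previous pattern step:

public class EscalatingAlertEventIterativeCondition extends IterativeCondition<AlertEvent> {

    private final String previousPatternName;

    public EscalatingAlertEventIterativeCondition(String previousPatternName) {
        this.previousPatternName = previousPatternName;
    }

    @Override
    public boolean filter(AlertEvent event, Context<AlertEvent> ctx) throws Exception {
        // Accept the event only if its level is higher than every event
        // already matched by the named previous step.
        for (AlertEvent previous : ctx.getEventsForPattern(previousPatternName)) {
            if (event.level <= previous.level) {
                return false;
            }
        }
        return true;
    }
}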

The problem I'm seeing is that CEP buffers forever: breakpoints inside the filter and iterative conditions are no longer hit, and the filtering/selection never happens. I initially thought this could be due to CEP's lateness buffer, but I'm unsure as I'm new to both Flink and Flink CEP. Is there any way to avoid the lateness buffer, or does something else look amiss?
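For reference, the select step applied to the returned PatternStream (which never fires) looks roughly like this; the stream name, output type, and message are placeholders:

DataStream<String> escalations = patternStream.select(
        new PatternSelectFunction<AlertEvent, String>() {
            @Override
            public String select(Map<String, List<AlertEvent>> match) {
                // Each named step matches exactly one event.
                AlertEvent first = match.get("one").get(0);
                AlertEvent last = match.get("three").get(0);
                return "Escalation from level " + first.level + " to level " + last.level;
            }
        });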

Our job graph, where only the top, live stream of data is timestamped and watermarked: [job graph image]

  • CEP doesn't have any allowed lateness; late events are either dropped or sent to a side output. https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/libs/cep.html#handling-lateness-in-event-time – David Anderson Dec 11 '18 at 06:37
  • Thanks David, yeah I've read that and was just wondering if there was any way to get around it as we handle lateness and windowing elsewhere further up the pipeline. – austin_ce Dec 11 '18 at 14:50

0 Answers