I have an EventTime streaming application that uses the Flink CEP library to match a basic three-step pattern on a joined stream. The joined stream combines live data, which is watermarked and windowed, with a stream of historical items that is neither windowed nor watermarked.
The setup is similar to the dataArtisans blog post except with the CEP Pattern as the last step.
Our CEP setup looks like this, and it worked before we added the non-timestamped historical stream. The EscalatingAlertEventIterativeCondition makes sure that the previous event match is of a greater level than the next.
    Pattern<AlertEvent, ?> pattern = Pattern.<AlertEvent>begin("one")
        .where((AlertEvent event) -> event.level > 0)
        .next("two")
        .where(new EscalatingAlertEventIterativeCondition("one"))
        .next("three")
        .where(new EscalatingAlertEventIterativeCondition("two"));

    return CEP.pattern(
        alertEventStream,
        pattern
    );
The problem I'm seeing is that CEP appears to buffer forever: breakpoints in the filter and iterative conditions are no longer hit, and the filtering/selection never happens. I initially thought this could be due to the CEP lateness buffer, but I'm unsure, as I am new to both Flink and Flink CEP. Is there any way to avoid the lateness buffer, or does something else look amiss?
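One possible explanation, sketched here as a plain-Java model rather than Flink's actual implementation: in event time, an operator with two inputs generally advances its watermark only to the minimum of its input watermarks, and CEP holds events until the watermark reaches their timestamps. If the historical input never emits a watermark, the combined watermark would stay at its initial value and nothing would be released. The class and method names below (`WatermarkSketch`, `combinedWatermark`, `releasable`) are hypothetical and do not use any Flink API; the timestamps are made-up illustration values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WatermarkSketch {
    // Assumption: an input that never emits a watermark behaves as if it were stuck at Long.MIN_VALUE.
    static final long NO_WATERMARK = Long.MIN_VALUE;

    // A two-input operator can only advance event time as far as its slower input.
    static long combinedWatermark(long liveWatermark, long historicalWatermark) {
        return Math.min(liveWatermark, historicalWatermark);
    }

    // Returns the buffered event timestamps at or below the watermark,
    // i.e. the events that an event-time buffer would be allowed to release.
    static List<Long> releasable(List<Long> bufferedTimestamps, long watermark) {
        List<Long> released = new ArrayList<>();
        for (long ts : bufferedTimestamps) {
            if (ts <= watermark) {
                released.add(ts);
            }
        }
        return released;
    }

    public static void main(String[] args) {
        List<Long> buffered = Arrays.asList(1000L, 2000L, 3000L);
        long liveWatermark = 2500L;

        // Historical input without watermarks: the combined watermark never moves.
        System.out.println(releasable(buffered, combinedWatermark(liveWatermark, NO_WATERMARK))); // prints []

        // Historical input that signals "no more event-time data" with a maximal watermark.
        System.out.println(releasable(buffered, combinedWatermark(liveWatermark, Long.MAX_VALUE))); // prints [1000, 2000]
    }
}
```

If this model matches what is happening in the job, the buffering would come from the un-watermarked historical input rather than from the lateness buffer itself, and assigning timestamps and watermarks to the historical stream as well may be worth trying.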
Our job graph, where only the top, live stream of data is timestamped and watermarked: