While studying the Flink CEP library over the last few days, I've been under the impression that it doesn't add any fundamentally new functionality to Flink's standard capabilities. It seems that Flink CEP's only purpose is to make event processing easier, with clear semantics and an intuitive code structure. As an example, Flink CEP offers only 5 match skipping strategies. Although these may be enough for a wide range of cases, they may not solve specific problems, which forces us back to plain Flink.
One test case is the following requirement:
Emit an alert (represented by 'a') for each non-overlapping pair of numbers in a stream
Expressed as the pattern:
Pattern.begin[EventType]("pair",skipStrategy).where(new AlwaysTrueFunction()).times(2)
So, for an input like (numbers entering the stream from left to right) 1 1 1 1 1
, the expected output would be a a
, but none of the 5 match skipping strategies would give the right result:
No-skip: a a a a
Skip-to-next: a a a a
Skip-past-last-event: a a a a
Skip-to-first[1]: a a a a
Skip-to-last[1]: a a a a
Although these strategies can't generate the desired output, it could easily be produced using a RichFunction
with a ValueState
counter that determines when a new alert should be emitted, transforming the input stream into a stream of alert events.
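To make the idea concrete, here is a minimal sketch of that counter logic in plain Scala, with no Flink dependency. The names `PairAlerts`, `step`, and `alerts` are made up for illustration. In a real job, I assume the counter would live in a ValueState on a keyed stream and the alert would be emitted through the function's collector instead of being returned.

```scala
// Plain-Scala model of the counter logic (hypothetical names, not Flink API).
object PairAlerts {
  // Given the current counter, return the counter after one more event
  // and the alert emitted by that event, if any.
  def step(count: Int): (Int, Option[Char]) =
    if (count + 1 == 2) (0, Some('a')) // second event of a pair: alert and reset
    else (count + 1, None)             // first event of a pair: just remember it

  // Run the counter over a finite sequence of events and collect the alerts.
  def alerts[T](events: Seq[T]): Seq[Char] = {
    var count = 0 // stands in for the value kept in ValueState
    events.flatMap { _ =>
      val (next, alert) = step(count)
      count = next
      alert
    }
  }
}
```

For the input 1 1 1 1 1 above, `PairAlerts.alerts(Seq(1, 1, 1, 1, 1))` yields two alerts, and the fifth event stays pending until a sixth one arrives.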
Thus, I would appreciate some light on these questions:
Why was the CEP library created if plain Flink seems to be more complete?
Is a pattern built with CEP more efficient (greater throughput or another metric) than one built with Flink's standard DataStream operators? (If possible, please provide links to articles, papers, or documentation about this.)