I have the following use case.
A machine sends event streams to Kafka, and these are consumed by a CEP engine
that generates warnings when conditions on the stream data are satisfied.
FlinkKafkaConsumer011<Event> kafkaSource = new FlinkKafkaConsumer011<Event>(kafkaInputTopic, new EventDeserializationSchema(), properties);
DataStream<Event> eventStream = env.addSource(kafkaSource);
The Event POJO contains id, name, time, and ip.
The machine will send a large volume of data to Kafka. It emits 35 unique event names (name1, name2, ..., name35), and I want to detect a pattern for each combination of event names (name1 co-occurring with name2, name1 co-occurring with name3, and so on). That gives 1225 combinations in total (35 × 35).
The Rule POJO contains e1Name and e2Name.
List<Rule> ruleList contains the 1225 rules.
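For reference, a minimal sketch of how such a rule list could be built. It assumes the 1225 rules are all ordered pairs of the 35 names, including a name paired with itself (35 × 35 = 1225). The Rule class below is a simplified stand-in for the actual POJO, and the names RuleListSketch and buildRules are hypothetical, used only for illustration:

```java
import java.util.ArrayList;
import java.util.List;

public class RuleListSketch {

    /** Simplified stand-in for the Rule POJO (assumption: only e1Name and e2Name are needed). */
    public static class Rule {
        private final String e1Name;
        private final String e2Name;

        public Rule(String e1Name, String e2Name) {
            this.e1Name = e1Name;
            this.e2Name = e2Name;
        }

        public String getE1Name() { return e1Name; }
        public String getE2Name() { return e2Name; }
    }

    /** Builds every ordered pair of name1..nameN, including a name paired with itself. */
    public static List<Rule> buildRules(int nameCount) {
        List<Rule> rules = new ArrayList<>();
        for (int i = 1; i <= nameCount; i++) {
            for (int j = 1; j <= nameCount; j++) {
                rules.add(new Rule("name" + i, "name" + j));
            }
        }
        return rules;
    }

    public static void main(String[] args) {
        List<Rule> ruleList = buildRules(35);
        // 35 * 35 ordered pairs
        System.out.println(ruleList.size());
    }
}
```

Each of these rules then becomes one pattern in the loop over ruleList.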
for (Rule rule : ruleList) {
    // One pattern per rule: an e1Name event followed by an e2Name event within 30 seconds.
    Pattern<Event, ?> warningPattern = Pattern.<Event>begin("start").where(new SimpleCondition<Event>() {
        @Override
        public boolean filter(Event value) throws Exception {
            return value.getName().equals(rule.getE1Name());
        }
    }).followedBy("next").where(new SimpleCondition<Event>() {
        @Override
        public boolean filter(Event value) throws Exception {
            return value.getName().equals(rule.getE2Name());
        }
    }).within(Time.seconds(30));
    PatternStream<Event> patternStream = CEP.pattern(eventStream, warningPattern);
}
Is this the correct way to execute multiple patterns on one data stream, or is there a more optimized way to achieve this? With the above approach we are getting PartitionNotFoundException, UnknownTaskExecutorException, and memory issues.