I have a simple program that uses the Flink CEP library to detect multiple failed logins from a file of log records. The application uses event time, and I do a keyBy on the logged-in 'user'.
The program works fine when I set the StreamExecutionEnvironment parallelism to 1. With any other parallelism it fails to detect the pattern, and I am unable to understand why.
I can see that all records for a particular user go to the same thread, so why is there an issue? I also see that the records are often not in event-time order (not sure if that is a problem), but I couldn't find anything in the API that lets me sort the records by event time within a window.
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
env.getConfig().setAutoWatermarkInterval(1000);
env.setParallelism(1); //tried with 1 & 4
.....
DataStream<LogEvent> inputLogEventStream = env
    .readFile(format, FILEPATH, FileProcessingMode.PROCESS_CONTINUOUSLY, 1000)
    .map(new MapToLogEvents())
    .assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor<LogEvent>(Time.seconds(0)) {
        public long extractTimestamp(LogEvent element) {
            return element.getTimeLong();
        }
    })
    .keyBy(new KeySelector<LogEvent, String>() {
        public String getKey(LogEvent le) throws Exception {
            return le.getUser();
        }
    });
inputLogEventStream.print();
Pattern<LogEvent, ?> mflPattern = Pattern.<LogEvent> begin("mfl")
    .subtype(LogEvent.class).where(
        new SimpleCondition<LogEvent>() {
            public boolean filter(LogEvent logEvent) {
                return logEvent.getResult().equalsIgnoreCase("failed");
            }
        })
    .timesOrMore(3).within(Time.seconds(60));
PatternStream<LogEvent> mflPatternStream = CEP.pattern(inputLogEventStream, mflPattern);
DataStream<Threat> outputMflStream = mflPatternStream.select(
    new PatternSelectFunction<LogEvent, Threat>() {
        public Threat select(Map<String, List<LogEvent>> logEventsMap) {
            return new Threat("MULTIPLE FAILED LOGINS detected!");
        }
    });
outputMflStream.print();
The print outputs for each case are reproduced below.
parallelism = 1 (the pattern was detected):
04/03/2018 12:03:53 Source: Custom File Source(1/1) switched to RUNNING
04/03/2018 12:03:53 SelectCepOperator -> Sink: Unnamed(1/1) switched to RUNNING
04/03/2018 12:03:53 Split Reader: Custom File Source -> Map -> Timestamps/Watermarks(1/1) switched to RUNNING
04/03/2018 12:03:53 Sink: Unnamed(1/1) switched to RUNNING
LogEvent [recordType=base18, eventCategory=login, user=paul, machine=laptop1, result=failed, eventCount=1, dataBytes=100, time=2018-03-26T22:30:08Z, timeLong=1522103408000]
LogEvent [recordType=base19, eventCategory=login, user=deb, machine=desktop1, result=failed, eventCount=1, dataBytes=100, time=2018-03-26T22:30:03Z, timeLong=1522103403000]
LogEvent [recordType=base20, eventCategory=login, user=deb, machine=desktop1, result=failed, eventCount=1, dataBytes=100, time=2018-03-26T22:30:05Z, timeLong=1522103405000]
LogEvent [recordType=base21, eventCategory=login, user=deb, machine=desktop1, result=failed, eventCount=1, dataBytes=100, time=2018-03-26T22:30:06Z, timeLong=1522103406000]
**THREAT** ==> MULTIPLE FAILED LOGINS detected!
parallelism = 4 (the pattern was not detected):
04/03/2018 12:05:33 Split Reader: Custom File Source -> Map -> Timestamps/Watermarks(3/4) switched to RUNNING
04/03/2018 12:05:33 Split Reader: Custom File Source -> Map -> Timestamps/Watermarks(2/4) switched to RUNNING
04/03/2018 12:05:33 Sink: Unnamed(2/4) switched to RUNNING
04/03/2018 12:05:33 SelectCepOperator -> Sink: Unnamed(2/4) switched to RUNNING
04/03/2018 12:05:33 Sink: Unnamed(3/4) switched to RUNNING
04/03/2018 12:05:33 SelectCepOperator -> Sink: Unnamed(3/4) switched to RUNNING
2> LogEvent [recordType=base18, eventCategory=login, user=paul, machine=laptop1, result=failed, eventCount=1, dataBytes=100, time=2018-03-26T22:30:08Z, timeLong=1522103408000]
3> LogEvent [recordType=base21, eventCategory=login, user=deb, machine=desktop1, result=failed, eventCount=1, dataBytes=100, time=2018-03-26T22:30:06Z, timeLong=1522103406000]
3> LogEvent [recordType=base20, eventCategory=login, user=deb, machine=desktop1, result=failed, eventCount=1, dataBytes=100, time=2018-03-26T22:30:05Z, timeLong=1522103405000]
3> LogEvent [recordType=base19, eventCategory=login, user=deb, machine=desktop1, result=failed, eventCount=1, dataBytes=100, time=2018-03-26T22:30:03Z, timeLong=1522103403000]
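One possible angle on the two runs above, offered as a sketch rather than a confirmed diagnosis: the Flink documentation describes that, in event time, CEP buffers incoming elements sorted by timestamp and only processes them once a watermark passes them, and that an operator with several input channels uses the minimum of the watermarks received on those channels. The plain-Java model below has no Flink dependency; the class `WatermarkSketch` and its methods are hypothetical names invented for illustration, and the assumption that three of the four upstream subtasks stay idle (and so never send a watermark) has not been verified for this job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical model of an event-time operator. This is NOT Flink code. */
public class WatermarkSketch {

    // Latest watermark seen per input channel; Long.MIN_VALUE means "none yet".
    private final long[] channelWatermarks;
    // Events wait here, ordered by timestamp, until the watermark passes them.
    private final PriorityQueue<Long> buffer = new PriorityQueue<>();
    // Events handed on to pattern matching, in event-time order.
    private final List<Long> released = new ArrayList<>();

    public WatermarkSketch(int inputChannels) {
        channelWatermarks = new long[inputChannels];
        Arrays.fill(channelWatermarks, Long.MIN_VALUE);
    }

    public void onEvent(long timestamp) {
        buffer.add(timestamp);
    }

    public void onWatermark(int channel, long watermark) {
        channelWatermarks[channel] = Math.max(channelWatermarks[channel], watermark);
        long current = currentWatermark();
        // Release every buffered event the combined watermark has passed.
        while (!buffer.isEmpty() && buffer.peek() <= current) {
            released.add(buffer.poll());
        }
    }

    // The operator's clock is the minimum over all input channels.
    public long currentWatermark() {
        return Arrays.stream(channelWatermarks).min().getAsLong();
    }

    public List<Long> released() {
        return released;
    }

    // Feeds the three 'deb' timestamps in the arrival order of the parallelism = 4 run.
    // Assumption: only channel 0 ever sends a watermark; the others stay idle.
    public static List<Long> run(int channels) {
        WatermarkSketch op = new WatermarkSketch(channels);
        long[] debTimes = {1522103406000L, 1522103405000L, 1522103403000L};
        for (long t : debTimes) {
            op.onEvent(t);
        }
        op.onWatermark(0, 1522103408000L); // illustrative watermark value
        return op.released();
    }

    public static void main(String[] args) {
        System.out.println("1 channel:  " + run(1));
        System.out.println("4 channels: " + run(4));
    }
}
```

In this model the single-channel run releases the three events in timestamp order, so out-of-order arrival alone does not prevent a match. The four-channel run releases nothing, because the combined watermark stays at Long.MIN_VALUE while any channel is silent. Whether the real job behaves this way would need to be checked against the watermarks actually reaching the CEP operator.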