How to detect babbling patterns using the Flink CEP library?

Example: Suppose a device has a fault that makes it continuously publish alternating values such as ON and OFF. How can I detect this pattern with CEP if the problem persists for 30 minutes? Some sample data is shown below.

OFF     16/08/18 11:38
ON      16/08/18 11:38
OFF     16/08/18 11:38
ON      16/08/18 11:37
OFF     16/08/18 11:37
ON      16/08/18 11:36
OFF     16/08/18 11:36
OFF     16/08/18 11:36
ON      16/08/18 11:36
OFF     16/08/18 11:35
ON      16/08/18 11:35
ON      16/08/18 11:34
OFF     16/08/18 11:34
MadProgrammer
  • Is this a question about algorithms or tooling? – David Anderson May 15 '20 at 14:44
  • I need to generate alerts for faulty devices. As you can see in the case above, faulty means the device keeps sending ON or OFF every few seconds. – MadProgrammer May 15 '20 at 16:01
  • So what exactly is it that characterizes faulty devices? Is the event rate alone enough, or do you also need to consider that it is switching rapidly between OFF and ON? – David Anderson May 15 '20 at 16:06
  • I only need to consider devices that switch rapidly between ON and OFF for a particular duration. Say, if the device keeps switching rapidly between ON and OFF for 15 minutes, raise the alert. – MadProgrammer May 16 '20 at 05:42

1 Answer

If your stream is in-order by time (it only matters that the stream is sorted for each individual device), then you could easily transform the stream to make this analysis easier. A RichFlatMapFunction like this will transform the sequence of ON OFF events into a sequence of state CHANGE events:

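import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Emits "CHANGE" whenever a reading differs from the previous one.
// ValueState is keyed state, so apply this after keyBy (for example, by device).
static class DetectChanges extends RichFlatMapFunction<String, String> {
    private transient ValueState<String> previousState;

    @Override
    public void open(Configuration parameters) throws Exception {
        previousState = getRuntimeContext().getState(
                new ValueStateDescriptor<>("previousState", String.class));
    }

    @Override
    public void flatMap(String onOrOff, Collector<String> out) throws Exception {
        // Compare by value; != on Strings compares object references.
        // The state is null for a key's first event, which also counts as a change.
        if (!onOrOff.equals(previousState.value())) {
            out.collect("CHANGE");
            previousState.update(onOrOff);
        }
    }
}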

Now the problem has been reduced to determining if the stream has some number of CHANGE events during an interval of time. This could easily be done with sliding windows, or you could use CEP if you like.
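To make the counting step concrete, here is a minimal plain-Java sketch of the logic for a single device. It is not Flink API: the class name ChangeRateSketch, the method onReading, and the window and threshold values are all hypothetical, and the trailing window evaluated on every reading is only a simplified stand-in for a Flink sliding window. It assumes readings arrive in timestamp order.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

// Plain-Java illustration (hypothetical names, not Flink API): tracks state
// changes for ONE device and reports when too many fall in a trailing window.
class ChangeRateSketch {
    private final Duration window;   // assumed window length
    private final int threshold;     // assumed number of changes that means "babbling"
    private final Deque<Instant> recentChanges = new ArrayDeque<>();
    private String previousState;    // null until the first reading arrives

    ChangeRateSketch(Duration window, int threshold) {
        this.window = window;
        this.threshold = threshold;
    }

    /** Feeds one in-order reading; returns true once the change count reaches the threshold. */
    boolean onReading(String onOrOff, Instant timestamp) {
        // Same rule as DetectChanges: the first reading also counts as a change.
        if (!onOrOff.equals(previousState)) {
            previousState = onOrOff;
            recentChanges.addLast(timestamp);
        }
        // Drop changes that are older than the trailing window.
        Instant cutoff = timestamp.minus(window);
        while (!recentChanges.isEmpty() && recentChanges.peekFirst().isBefore(cutoff)) {
            recentChanges.removeFirst();
        }
        return recentChanges.size() >= threshold;
    }

    public static void main(String[] args) {
        ChangeRateSketch sketch = new ChangeRateSketch(Duration.ofMinutes(5), 4);
        Instant start = Instant.parse("2020-01-01T00:00:00Z"); // arbitrary base time
        String[] readings = {"OFF", "ON", "ON", "OFF", "ON"};
        for (int i = 0; i < readings.length; i++) {
            boolean babbling = sketch.onReading(readings[i], start.plusSeconds(20L * i));
            System.out.println(readings[i] + " -> babbling=" + babbling);
        }
    }
}
```

In a Flink job the per-device bookkeeping would live in keyed state or a window operator rather than in a plain object like this; the sketch only shows the rule being applied.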

You could also do this entirely with CEP. Conceptually you might approach this as follows:

  1. define an individual Pattern that matches ON+ OFF+
  2. then define a Pattern group that matches that ON/OFF pattern whenever it occurs n times within some time interval
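The semantics of those two steps can be illustrated with an analogy in plain Java, using java.util.regex rather than the CEP API. The class OnOffPatternSketch and the method hasCycles are hypothetical names. The sketch assumes the caller passes only the in-order readings of one device that fall inside the time interval (the part CEP would handle with its time constraint), and it treats any reading other than "ON" as OFF.

```java
import java.util.List;
import java.util.regex.Pattern;

// Analogy only (hypothetical names, not Flink CEP): encodes readings as
// characters and looks for n back-to-back "ON+ OFF+" cycles.
class OnOffPatternSketch {

    /** True if the readings contain at least n consecutive ON+ OFF+ cycles. */
    static boolean hasCycles(List<String> readings, int n) {
        StringBuilder encoded = new StringBuilder();
        for (String reading : readings) {
            // 'N' stands for ON, 'F' for anything else (assumed to be OFF).
            encoded.append("ON".equals(reading) ? 'N' : 'F');
        }
        // Step 1 is the inner N+F+ (one or more ON, then one or more OFF);
        // step 2 is the {n} repetition of that group.
        Pattern cycles = Pattern.compile("(?:N+F+){" + n + "}");
        return cycles.matcher(encoded).find();
    }

    public static void main(String[] args) {
        List<String> readings = java.util.Arrays.asList(
                "ON", "OFF", "ON", "ON", "OFF", "OFF", "ON", "OFF");
        System.out.println(hasCycles(readings, 3));
    }
}
```

Because repeated ON or OFF readings are absorbed by the + quantifiers, a run such as ON ON ON OFF counts as a single cycle rather than several.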
David Anderson
  • Is it possible to do this using only a CEP pattern on the DataStream? – MadProgrammer May 16 '20 at 13:45
  • Yes, definitely. But it's more difficult (in my opinion), which is why I didn't do it that way. – David Anderson May 16 '20 at 13:47
  • OK, this is fine. But in our case it would be a little easier to create the pattern using CEP, because any other changes might affect major areas. I would be grateful if you could share the CEP pattern solution. That would be really helpful. – MadProgrammer May 16 '20 at 13:55
  • I've thought of a way to do it with CEP that's easier than I originally thought possible, which I've added to my answer. If you get stuck writing the code, please share your work-in-progress. – David Anderson May 16 '20 at 14:01
  • BTW, you could also do this with Flink SQL and MATCH_RECOGNIZE. – David Anderson May 16 '20 at 14:04
  • If I create a pattern, one problem is the number of matches it will attempt. For example, "OFF   16/08/18 11:38" will match with "OFF   16/08/18 11:37" as well as with "OFF   16/08/18 11:36", which may produce false positives. – MadProgrammer May 16 '20 at 14:12
  • And Flink SQL is more about batch processing, right? – MadProgrammer May 16 '20 at 14:14
  • Flink SQL applies to both streaming and batch – David Anderson May 16 '20 at 14:36