I am new to the Apache Flink API and I am trying to understand the different windows it offers.

I have a stream of events such as:

    device_id,trigger_id,event_time,messageId
    1,START,1520433909396,1
    1,TRACKING,1520433914398,2
    1,TRACKING,1520433919398,3
    1,STOP,1520433924398,4
    1,START,1520433929398,5
    1,TRACKING,1520433934399,6
    1,TRACKING,1520433939399,7
    1,TRACKING,1520433944399,8
    1,STOP,1520433949399,9

where trigger_id is an indicator such as START, TRACKING or STOP.

What I would like to do is group all incoming events by device_id and define a window based on the trigger_id, i.e. group all events from START until STOP and then do some calculations such as average, max, etc.

This could be defined as a GlobalWindow with a custom Trigger based on the trigger_id, plus a custom Evictor to evict the list of events each time a STOP trigger is reached.

Another option could be to use Flink CEP. I have defined the following pattern:
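To make the intended semantics concrete, here is a minimal plain-Java sketch of the buffer, fire and evict behaviour described above. It deliberately uses no Flink classes: `TripSegmenter`, its `Event` type, `accept`, `segment` and `sample` are hypothetical names for illustration, and the per-device map only stands in for what a keyed GlobalWindow would hold. It also assumes that a new START discards any unfinished segment, which the description above does not specify.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only: no Flink classes are used here.
final class TripSegmenter {

    // Hypothetical event type mirroring the CSV columns shown above.
    static final class Event {
        final int deviceId;
        final String triggerId;
        final long eventTime;
        final int messageId;

        Event(int deviceId, String triggerId, long eventTime, int messageId) {
            this.deviceId = deviceId;
            this.triggerId = triggerId;
            this.eventTime = eventTime;
            this.messageId = messageId;
        }
    }

    // Open segment per device; stands in for the contents of a keyed window.
    private final Map<Integer, List<Event>> open = new HashMap<>();

    // Feeds one event. Returns the completed START..STOP segment when a STOP
    // arrives for a device with an open segment, otherwise null.
    List<Event> accept(Event e) {
        if ("START".equals(e.triggerId)) {
            // Assumption: a new START discards any unfinished segment.
            List<Event> buffer = new ArrayList<>();
            buffer.add(e);
            open.put(e.deviceId, buffer);
            return null;
        }
        List<Event> buffer = open.get(e.deviceId);
        if (buffer == null) {
            return null; // no START seen yet for this device, so ignore
        }
        buffer.add(e);
        if ("STOP".equals(e.triggerId)) {
            open.remove(e.deviceId); // "evict" the buffered events
            return buffer;           // "fire" with the finished segment
        }
        return null;
    }

    // Runs all events through a fresh segmenter and collects the message ids
    // of every completed segment, in completion order.
    static List<List<Integer>> segment(List<Event> events) {
        TripSegmenter segmenter = new TripSegmenter();
        List<List<Integer>> result = new ArrayList<>();
        for (Event e : events) {
            List<Event> done = segmenter.accept(e);
            if (done != null) {
                List<Integer> ids = new ArrayList<>();
                for (Event d : done) {
                    ids.add(d.messageId);
                }
                result.add(ids);
            }
        }
        return result;
    }

    // The nine sample events listed above.
    static List<Event> sample() {
        List<Event> events = new ArrayList<>();
        events.add(new Event(1, "START", 1520433909396L, 1));
        events.add(new Event(1, "TRACKING", 1520433914398L, 2));
        events.add(new Event(1, "TRACKING", 1520433919398L, 3));
        events.add(new Event(1, "STOP", 1520433924398L, 4));
        events.add(new Event(1, "START", 1520433929398L, 5));
        events.add(new Event(1, "TRACKING", 1520433934399L, 6));
        events.add(new Event(1, "TRACKING", 1520433939399L, 7));
        events.add(new Event(1, "TRACKING", 1520433944399L, 8));
        events.add(new Event(1, "STOP", 1520433949399L, 9));
        return events;
    }

    public static void main(String[] args) {
        for (List<Integer> ids : segment(sample())) {
            System.out.println(ids);
        }
    }
}
```

On the sample data this yields two segments, message ids 1 to 4 and 5 to 9. Any aggregation (average, max, etc.) would then run over each returned segment.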

    DataStream<String> input = env.readTextFile("events.csv");

    // create event stream
    DataStream<Event> events = input.map(new LineToEvent());
    DataStream<Event> waterMarkedStreams = events.assignTimestampsAndWatermarks(new EventAssigner());

    Pattern<Event, Event> tripPattern =
            Pattern.<Event>begin("start",  AfterMatchSkipStrategy.noSkip())
                    .where(START_CONDITION)
                    .followedBy("middle").where(MIDDLE_CONDITION).oneOrMore()
                    .followedBy("end").where(END_CONDITION);
    PatternStream<Event> patternStream = CEP.pattern(waterMarkedStreams, tripPattern);

    DataStream<String> result = patternStream.select(
            new PatternSelectFunction<Event, String>() {
                @Override
                public String select(Map<String, List<Event>> pattern) throws Exception {

                    StringBuilder builder = new StringBuilder();
                    builder.append(pattern.get("start").get(0).getMessageId()).append(",");
                    List<Event> vals = pattern.get("middle");
                    for (Event e: vals) {
                        builder.append(e.getMessageId()).append(",");
                    }
                    builder.append(pattern.get("end").get(0).getMessageId()).append(",");
                    return builder.toString();
                }
            });

    result.print();

where all conditions are static inner classes that extend SimpleCondition.

However, the pattern matches all possible combinations in the stream of events, like so:

    1> 1,2,3,4,
    1> 1,2,3,6,9,
    2> 1,2,4,
    2> 5,6,7,8,9,
    3> 1,2,3,6,7,8,9,
    3> 5,6,7,9,
    4> 1,2,3,6,7,9,
    4> 5,6,9,

Does a pattern have a notion of an Evictor? How can I keep only the specific sets of events, i.e.:

    1,2,3,4,
    5,6,7,8,9,
  • The example at http://training.data-artisans.com/exercises/carSegments.html is fairly similar. – David Anderson Mar 01 '18 at 10:47
  • Thanks for the tip, I will look into it. –  Mar 01 '18 at 12:39
  • I have looked at the example and created a custom trigger, which seems to work. I also looked at Flink CEP and wondered if I could do something similar. I have updated my question. –  Mar 07 '18 at 13:34

1 Answer

Yes, you should be able to do this with CEP. You need to do more than just define the pattern, you also have to apply the pattern to the stream, and then select the matching sequences and use those to emit some results. There's a fairly complete example in the documentation.

David Anderson
  • I have updated my question to include more than just the pattern I used. –  Mar 09 '18 at 13:17