For example, A should be followed by B within 10 seconds. I know how to detect that this DID occur (.next, .within), but I want to send an alert if B never happens within the window.

    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // checkpointing is required for exactly-once or at-least-once guarantees
//      env.enableCheckpointing(1000);

        final RMQConnectionConfig connectionConfig = new RMQConnectionConfig.Builder()
            .setHost("localhost")
            .setPort(5672)
            .setVirtualHost("/")
            .setUserName("guest")
            .setPassword("guest")
            .build();

        final DataStream<String> inputStream = env
            .addSource(new RMQSource<String>(
                connectionConfig,               // config for the RabbitMQ connection
                "cep",                          // name of the RabbitMQ queue to consume
                true,                           // use correlation ids; can be false if only at-least-once is required
                new SimpleStringSchema()))      // deserialization schema to turn messages into Java objects
            .setParallelism(1);                 // non-parallel source is only required for exactly-once

        inputStream.print();

        Pattern<String, ?> simplePattern =
                Pattern.<String>begin("start")
                    .where(new SimpleCondition<String>() {
                        @Override
                        public boolean filter(String event) {
                            return event.equals("A");
                        }
                    })
                    .next("end")
                    .where(new SimpleCondition<String>() {
                        @Override
                        public boolean filter(String event) {
                            return event.equals("B");
                        }
                    });

        PatternStream<String> timedOutPatternStream = CEP.pattern(inputStream, simplePattern.within(Time.seconds(10)));
        OutputTag<String> timedout = new OutputTag<String>("timedout"){};
        SingleOutputStreamOperator<String> timedOutNotificationsStream = timedOutPatternStream.flatSelect(
            timedout,
            new TimedOut<String>(),
            new FlatSelectNothing<String>()
        );
        timedOutNotificationsStream.getSideOutput(timedout).print();

        env.execute("mynotification");
    }

public static class TimedOut<T> implements PatternFlatTimeoutFunction<T, String> {
    @Override
    public void timeout(Map<String, List<T>> pattern, long timeoutTimestamp, Collector<String> out) throws Exception {
        // Emit one alert for every partial match (an "A" with no "B") that timed out.
        out.collect("LATE!");
    }
}

public static class FlatSelectNothing<T> implements PatternFlatSelectFunction<T, T> {
    @Override
    public void flatSelect(Map<String, List<T>> pattern, Collector<T> collector) {}
}

Actual behavior:

publish "A"
(wait 5 seconds)
publish "B"
=> (no alert)

publish "A"
(wait 10 seconds)
=> (no alert, but there should be one)

publish "A"
(wait 10 seconds)
publish "B"
=> "LATE!"

Expected behavior:

publish "A"
(wait 10 seconds)
=> "LATE!"
atkayla

2 Answers

You can do it via timed-out patterns. Specify a pattern like A followedBy B within 10 seconds and check for partial matches that timed out, which means there was an A but no B. You can check the docs for timed-out patterns here.

For a full example you can refer to this training or go straight to the solution to the exercise.


EDIT: Right now (Flink < 1.5), in processing time, timed-out partial matches are pruned only when a new element arrives. So, unfortunately, at least one more event (matching or not) has to arrive after the timeout to trigger it. Efforts to improve this can be tracked with this Jira ticket.
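
To make the behavior described in the EDIT concrete, here is a toy model in plain Java. It is not Flink's implementation; the class and method names (`LazyTimeoutModel`, `onElement`) are invented for illustration, and time is passed in explicitly instead of being read from a clock. The only point it shows is that when expiry is checked solely on element arrival, a lone "A" never produces an alert by itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Flink code): expiry is only checked when an element arrives. */
final class LazyTimeoutModel {
    static final long WINDOW_MS = 10_000;

    private Long pendingStart = null;              // arrival time of an unmatched "A"
    final List<String> alerts = new ArrayList<>(); // collected timeout notifications

    /** Called once per incoming element; this is the only place a timeout can be noticed. */
    void onElement(String event, long nowMs) {
        // First, report a pending "A" whose window has already passed.
        if (pendingStart != null && nowMs - pendingStart >= WINDOW_MS) {
            alerts.add("LATE!");
            pendingStart = null;
        }
        if (event.equals("A")) {
            pendingStart = nowMs;                  // start a new partial match
        } else if (event.equals("B")) {
            pendingStart = null;                   // matched in time, or nothing was pending
        }
    }

    public static void main(String[] args) {
        LazyTimeoutModel model = new LazyTimeoutModel();
        model.onElement("A", 0);
        System.out.println(model.alerts);          // prints [] no matter how long we wait
        model.onElement("B", 12_000);              // any later element triggers the check
        System.out.println(model.alerts);          // prints [LATE!]
    }
}
```

This mirrors the "Actual behavior" listed in the question: the alert for the first "A" only shows up once some later element is processed.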

Dawid Wysakowicz
  • "Timed Out Partial Patterns" was the term I needed. Thanks, appreciate it! – atkayla May 15 '18 at 19:24
  • Actually, I think I may be doing something wrong. Please see my updated post. I am unable to get the "late" notification until I publish the B event. This is not what I want because B could never happen (e.g. package gets lost) and I'd still like to alert if that was the case. – atkayla May 16 '18 at 00:20
  • @kayla I've updated my answer, but next time, rather than editing the question so that its meaning changes completely, create a new one. Otherwise a good answer may end up inaccurate. – Dawid Wysakowicz May 16 '18 at 07:07
  • Sorry for the inconvenience. So it sounds like what I'm asking for cannot currently be done. Shouldn't be too much of a problem, because if you can imagine some shipping system, shipping events should always be coming in unless the entire shipping system is down! Yes? – atkayla May 16 '18 at 14:59
  • My apologies, but I played around with it some more, and I'm not sure how I can get this to suffice. It seems like I can only handle "package WAS late" but not "package IS late/lost". Imagine packageIds A and B. If I do `input.keyBy("packageId")`, these get checked separately, so I can't use the "next" event to trigger "A is late" by sending an event for B. I have "try ship A -> (time window expires)" and that's it, not "try ship A -> (time window expires) -> successfully shipped A". So it never gets the last event, and the package is missing with no way to detect that and send the notification? – atkayla May 16 '18 at 19:02
  • Any progress on this issue? On the Jira I've seen some activity, but the ticket is still open. I'm confronted with the same case. At https://ci.apache.org/projects/flink/flink-docs-stable/dev/event_time.html, the "Idling sources" section suggests using "an assigner that switches to using current processing time as the time basis after not observing new events for a while". Unfortunately, no example is provided in this direction. Do you have any clues? – florins Dec 12 '18 at 15:04
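
For the per-key "package IS late" case raised in these comments, one commonly suggested direction is to stop relying on a following element and let a timer do the check. The sketch below is a plain-Java toy model of that idea, not Flink code: `AbsenceDetector`, `onElement`, and `onTimer` are hypothetical names, and the caller supplies the clock. In a real Flink job this role would typically be played by a process function on a keyed stream that registers processing-time timers; how well that fits a given job is an assumption to verify.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Toy model (not Flink code): a per-key deadline that a timer callback checks. */
final class AbsenceDetector {
    private final long windowMs;
    private final Map<String, Long> deadlines = new HashMap<>(); // key -> time by which "B" is due
    final List<String> alerts = new ArrayList<>();

    AbsenceDetector(long windowMs) {
        this.windowMs = windowMs;
    }

    /** Per element: "A" arms the deadline for its key, "B" disarms it. */
    void onElement(String key, String event, long nowMs) {
        if (event.equals("A")) {
            deadlines.put(key, nowMs + windowMs);
        } else if (event.equals("B")) {
            deadlines.remove(key);
        }
    }

    /** Driven by the clock, not by elements: report every key whose deadline has passed. */
    void onTimer(long nowMs) {
        Iterator<Map.Entry<String, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs >= entry.getValue()) {
                alerts.add("LATE! " + entry.getKey());
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        AbsenceDetector detector = new AbsenceDetector(10_000);
        detector.onElement("pkg-1", "A", 0);
        detector.onElement("pkg-2", "A", 1_000);
        detector.onElement("pkg-2", "B", 4_000);   // pkg-2 completes in time
        detector.onTimer(5_000);
        System.out.println(detector.alerts);       // prints []
        detector.onTimer(10_000);                  // no further element ever arrives for pkg-1
        System.out.println(detector.alerts);       // prints [LATE! pkg-1]
    }
}
```

Because the deadline check runs from the timer callback, the alert for `pkg-1` fires even though no other event for that key ever shows up.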
0

Can you try the solution below?

package com.nirav.modi.cep;

import com.nirav.modi.dto.Event;
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternFlatSelectFunction;
import org.apache.flink.cep.PatternFlatTimeoutFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

import java.util.List;
import java.util.Map;

public class EventNotOccur {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStreamSource<Event> source = env.addSource(new SourceFunction<Event>() {
            @Override
            public void run(SourceContext<Event> ctx) throws Exception {

                for (int i = 0; i < 1; i++) {
                    ctx.collect(new Event("A"));
                    Thread.sleep(5000);
                    ctx.collect(new Event("B"));
                    Thread.sleep(5000);
                    ctx.collect(new Event("A"));
                    Thread.sleep(15000);
                    ctx.collect(new Event("B"));
                    Thread.sleep(5000);
                    ctx.collect(new Event("B"));
                }
            }

            @Override
            public void cancel() {

            }
        });

        Pattern<Event, ?> simplePattern =
                Pattern.<Event>begin("start")
                        .where(new SimpleCondition<Event>() {
                            @Override
                            public boolean filter(Event event) {
                                return event.getName().equals("A");
                            }
                        })
                        .next("end")
                        .where(new SimpleCondition<Event>() {
                            @Override
                            public boolean filter(Event event) {
                                return event.getName().equals("B");
                            }
                        });

        source.print();

        PatternStream<Event> timedOutPatternStream = CEP.pattern(source, simplePattern.within(Time.seconds(10)));

        OutputTag<Event> timedout = new OutputTag<Event>("timedout") {

        };

        timedOutPatternStream.flatSelect(new PatternFlatSelectFunction<Event, String>() {
            @Override
            public void flatSelect(Map<String, List<Event>> pattern, Collector<String> out) throws Exception {
                out.collect("Pattern Match...............");
            }
        }).print();

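        // Timed-out partial matches go to the side output; complete matches go to the main output.
        SingleOutputStreamOperator<Event> timedOutStream = timedOutPatternStream
                .flatSelect(
                        timedout,
                        new EventTimeOut(),
                        new FlatSelectNothing<Event>()
                );

        timedOutStream.getSideOutput(timedout).print();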


        env.execute("Flink Streaming Java API Skeleton");

    }

    public static class EventTimeOut implements PatternFlatTimeoutFunction<Event, Event> {
        @Override
        public void timeout(Map<String, List<Event>> map, long timeoutTimestamp, Collector<Event> collector) throws Exception {
            // The partial match holds only the "start" event; "end" never arrived in time.
            Event timedOutStart = map.get("start").get(0);
            System.out.println("Time out Partial Event : " + timedOutStart);
            collector.collect(timedOutStart);
        }
    }

    public static class FlatSelectNothing<T> implements PatternFlatSelectFunction<T, T> {
        @Override
        public void flatSelect(Map<String, List<T>> pattern, Collector<T> collector) {

            System.out.println("Flat select nothing: " + pattern.get("start").get(0));
            collector.collect(pattern.get("start").get(0));

        }
    }
}
NIrav Modi
  • This doesn't seem much different from the sample code in the question. It would require the next event to trigger the timeout, yes? So A->(10 seconds) won't work, only A->(10 seconds)->B. – atkayla May 16 '18 at 14:56