
I want three values: aggValueInLastHour, aggValueInLastDay, and aggValueInLastThreeDay.

Here is what I have tried so far:

(screenshot of my sliding-window attempt omitted)

But I don't want to wait: I'd prefer not to use a sliding window for the aggregation, because a 3-day window has to wait for three days of data before it fires, which is unacceptable for our system.

How can I get the last-3-day aggregation value as soon as the first event arrives?

Thanks in advance for any advice!

Brutal_JL

Do you mean like a rolling window? Something like `OVER` aggregation in SQL: https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/sql.html#aggregations? – Dawid Wysakowicz May 03 '18 at 13:35

3 Answers


If you want more frequent updates, you can use QueryableState and poll the state at a rate that suits your use case.
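
For illustration, a rough sketch of what this could look like, assuming a DataStream<RiskEvent> named dataStream as in the question; the queryable-state name, proxy host, port, and job ID below are placeholders:

    import java.util.concurrent.CompletableFuture;

    import org.apache.flink.api.common.JobID;
    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.apache.flink.api.common.state.ReducingState;
    import org.apache.flink.api.common.state.ReducingStateDescriptor;
    import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
    import org.apache.flink.api.java.functions.KeySelector;
    import org.apache.flink.queryablestate.client.QueryableStateClient;

    // Job side: expose a running count as queryable state instead of waiting for a window to close.
    ReducingStateDescriptor<Long> countDescriptor = new ReducingStateDescriptor<>(
            "eventCount",
            (ReduceFunction<Long>) (a, b) -> a + b,
            Long.class);

    dataStream
            .map(event -> 1L)                                   // one unit per event (placeholder aggregate)
            .returns(BasicTypeInfo.LONG_TYPE_INFO)
            .keyBy(new KeySelector<Long, String>() {
                @Override
                public String getKey(Long value) {
                    return "all";                               // single logical key for a global count
                }
            })
            .asQueryableState("agg-query", countDescriptor);

    // Client side: poll the state at whatever rate suits the use case.
    // (Checked exceptions omitted for brevity.)
    QueryableStateClient client = new QueryableStateClient("flink-proxy-host", 9069);
    CompletableFuture<ReducingState<Long>> stateFuture = client.getKvState(
            JobID.fromHexString("<running-job-id>"),            // ID of the running job
            "agg-query",
            "all",
            BasicTypeInfo.STRING_TYPE_INFO,
            countDescriptor);
    Long currentCount = stateFuture.get().get();                // current aggregate, no window firing needed

The client call can run in any external polling loop, so the current aggregate is available as soon as the first event has been processed.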

Alex

You can make use of the ContinuousEventTimeTrigger, which will cause your window to fire on a shorter time period than the full window, allowing you to see the intermediate state. You can optionally wrap that in a PurgingTrigger if the downstream consumers of your sink expect each output to be a partial aggregation (rather than the full current state) and sum the partials up themselves.
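
For example, a rough sketch of that idea, assuming a stream of (key, value) pairs named input with event-time timestamps and watermarks already assigned; the key position, intervals, and field index are placeholders:

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.streaming.api.windowing.triggers.ContinuousEventTimeTrigger;
    import org.apache.flink.streaming.api.windowing.triggers.PurgingTrigger;

    // A 3-day event-time window that fires every hour instead of only at the end of the window.
    DataStream<Tuple2<String, Long>> aggValueInLastThreeDay = input
            .keyBy(0)                                                 // key is the first tuple field
            .window(TumblingEventTimeWindows.of(Time.days(3)))        // the full 3-day window
            .trigger(ContinuousEventTimeTrigger.of(Time.hours(1)))    // emit the current state every hour
            // .trigger(PurgingTrigger.of(ContinuousEventTimeTrigger.of(Time.hours(1)))) // emit deltas instead
            .sum(1);                                                  // running sum of the value field

With the plain ContinuousEventTimeTrigger each firing emits the full aggregate accumulated so far; wrapping it in PurgingTrigger clears the window state after each firing, so downstream sees partial sums that it has to add up itself, as described above.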

Joshua DeWald

I've tried CEP.

code:

    AfterMatchSkipStrategy strategy = AfterMatchSkipStrategy.skipShortOnes();

    Pattern<RiskEvent, ?> loginPattern = Pattern.<RiskEvent>begin("start", strategy)
            .where(eventTypeCondition)
            .timesOrMore(1)
            .greedy()
            .within(Time.hours(1));

    KeyedStream<RiskEvent, String> keyedStream = dataStream.keyBy(new KeySelector<RiskEvent, String>() {
        @Override
        public String getKey(RiskEvent riskEvent) throws Exception {
            // key by event type and device fingerprint for aggregation
            return riskEvent.getEventType() + riskEvent.getDeviceFp();
        }
    });

    PatternStream<RiskEvent> eventPatternStream = CEP.pattern(keyedStream, loginPattern);

    eventPatternStream.select(new PatternSelectFunction<RiskEvent, RiskResult>() {
        @Override
        public RiskResult select(Map<String, List<RiskEvent>> map) throws Exception {
            List<RiskEvent> list = map.get("start");

            // collect the event times of all matched events
            ArrayList<Long> times = new ArrayList<>();
            for (RiskEvent riskEvent : list) {
                times.add(riskEvent.getEventTime());
            }
            Long min = Collections.min(times);
            Long max = Collections.max(times);

            Set<String> accountList = list.stream().map(RiskEvent::getUserName).collect(Collectors.toSet());
            logger.info("Time range: " + new Date(min) + " --- " + new Date(max)
                    + ", event: " + list.get(0).getEventType()
                    + ", device fingerprint: " + list.get(0).getDeviceFp()
                    + ", associated accounts: " + accountList.toString());
            return null; // the match is only logged here
        }
    });

You may notice that the skip strategy skipShortOnes is a custom strategy.

Here are my modifications to the CEP library:

  1. Add the strategy to the SkipStrategy enum:

    public enum SkipStrategy {
        NO_SKIP,
        SKIP_PAST_LAST_EVENT,
        SKIP_TO_FIRST,
        SKIP_TO_LAST,
        SKIP_SHORT_ONES
    }

  2. Add an access method in AfterMatchSkipStrategy.java:

    public static AfterMatchSkipStrategy skipShortOnes() {
        return new AfterMatchSkipStrategy(SkipStrategy.SKIP_SHORT_ONES);
    }

  3. Add the strategy's handling to the discardComputationStatesAccordingToStrategy method in NFA.java:

    case SKIP_SHORT_ONES:
        int i = 0;
        List<Map<String, List<T>>> tempResult = new ArrayList<>(matchedResult);
        for (Map<String, List<T>> resultMap : tempResult) {
            if (i++ == 0) {
                continue; // keep only the first matched result, remove the rest
            }
            matchedResult.remove(resultMap);
        }
        break;

Brutal_JL