
I have a stream, and I want to compare the number of events in the current window with the number in the previous window.

It can be done by keeping the number of events in the window in `globalState` and doing something like this:

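import org.apache.flink.streaming.api.scala.function.ProcessWindowFunction
import org.apache.flink.streaming.api.windowing.windows.TimeWindow
import org.apache.flink.util.Collector

class Foo[I, O] extends ProcessWindowFunction[I, O, String, TimeWindow] {
  // windowStateDescriptor is assumed to be a ValueStateDescriptor[Integer] defined elsewhere
  override def process(key: String, context: Context, elements: Iterable[I], out: Collector[O]): Unit = {
    val state = context.globalState.getState(windowStateDescriptor)
    val current = elements.size
    val previous = state.value // null until the first window for this key has fired
    if (previous != null && previous > current) {
      // do some out.collect
    }
    // always store the current count so the next window can compare against it
    state.update(current)
  }
}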

However, I am trying to avoid keeping persistent state. Is there a better, more idiomatic way to achieve this?

igx
  • Could you tell us a little more about what you are trying to achieve? – Dominik Wosiński May 26 '19 at 11:13
  • @DominikWosiński Sure. I am trying to compare the ratio between two windows and raise a flag accordingly, e.g. if the number of events in the current window is more than twice the number in the previous one, I should raise a flag. – igx May 26 '19 at 11:28
  • OK, but do you want to use a keyed window for this? Should the flag be raised only if the second window for a given key has more elements than the first window for that key, or if ANY window has more elements than ANY previous window? – Dominik Wosiński May 26 '19 at 11:31
  • @DominikWosiński Yes, this is a keyed stream. I want to raise a flag if the second window for a given key has more elements than the first window for that key. – igx May 26 '19 at 11:39
  • So you can simply use `keyedState` instead of `windowState`: keep the number of elements in the previous window and update it for each window. It is probably the fastest and most reliable way to do this. – Dominik Wosiński May 26 '19 at 14:43
  • @DominikWosiński I replaced it with `globalState`, which is not scoped to the window. I was just wondering if there is a better option. – igx May 26 '19 at 16:49
  • I think using state is probably the best and most reliable option in this case. Another thing you could do is return the number of elements from the `ProcessFunction` and then use Flink CEP to detect the pattern. – Dominik Wosiński May 26 '19 at 17:06
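
The rule described in the comments (flag a key when its current window holds more than twice as many events as its previous window) can be sketched on plain Scala collections. This is only a model of the comparison, not Flink code: `WindowCount`, `RatioFlag`, `flagged`, and the `factor` parameter are hypothetical names, and the input is assumed to be the per-window counts that a window function would emit for each key.

```scala
// Hypothetical model: one record per (key, window) carrying the event count of that window.
final case class WindowCount(key: String, windowEnd: Long, count: Long)

object RatioFlag {
  // Returns the windows whose count is more than `factor` times the count
  // of the immediately preceding window of the same key.
  def flagged(counts: Seq[WindowCount], factor: Double = 2.0): Seq[WindowCount] =
    counts
      .groupBy(_.key)
      .values
      .flatMap { perKey =>
        perKey
          .sortBy(_.windowEnd)  // order the windows of one key by time
          .sliding(2)           // pair each window with its predecessor
          .collect { case Seq(prev, cur) if cur.count > factor * prev.count => cur }
      }
      .toSeq
      .sortBy(w => (w.key, w.windowEnd))
}
```

In a running job the previous count still has to be kept somewhere between window firings, which is why the comments above lean towards state; the sketch only shows the comparison itself.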

0 Answers