I'm wondering whether it's possible to cascade sliding windows into one another with Spark Streaming.
For example, I have counts coming in every second, and I want to sum them over windows of 5, 15 and 30 seconds. Can the result of the 5-second window be reused for the 15-second one, and the 15-second result for the 30-second one?
The aim is to avoid storing every 1-second update for all the inputs for the length of the longest window (the granularity does not matter here). Instead, each window would reuse a DStream whose frequency matches the one it needs.
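To make the idea concrete, here is a minimal plain-Java sketch (no Spark involved) of the arithmetic I have in mind. The class name CascadedWindows and the helper tumblingSums are made up for illustration, and the sketch only covers non-overlapping windows, where the slide equals the window length. It is not meant to show how Spark behaves.

```java
import java.util.ArrayList;
import java.util.List;

public class CascadedWindows {

    // Hypothetical helper: sums consecutive, non-overlapping groups of `size` values.
    // A trailing group with fewer than `size` values is dropped.
    static List<Double> tumblingSums(List<Double> values, int size) {
        List<Double> out = new ArrayList<>();
        for (int start = 0; start + size <= values.size(); start += size) {
            double sum = 0.0;
            for (int i = start; i < start + size; i++) {
                sum += values.get(i);
            }
            out.add(sum);
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed input: one count per second for 30 seconds.
        List<Double> perSecond = new ArrayList<>();
        for (int i = 0; i < 30; i++) {
            perSecond.add(1.0);
        }

        // Cascade: 1 s values -> 5 s sums -> 15 s sums -> 30 s sum.
        List<Double> fiveSec = tumblingSums(perSecond, 5);     // 6 values
        List<Double> fifteenSec = tumblingSums(fiveSec, 3);    // 2 values
        List<Double> thirtySec = tumblingSums(fifteenSec, 2);  // 1 value

        // Same 30 s sum computed directly from the 1 s values, for comparison.
        List<Double> direct = tumblingSums(perSecond, 30);

        System.out.println("cascaded: " + thirtySec + ", direct: " + direct);
    }
}
```

The point is that the 30-second stage only needs the two most recent 15-second sums, rather than thirty 1-second values per key.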
Here's an example of what I tried:
JavaPairDStream<String, Double> test = input;
JavaPairDStream<String, Double> test1 = input;
// 5s: test slides every 1s (its result is overwritten below), test1 slides every 5s
test = test.reduceByKeyAndWindow(new SumReducer(), new Duration(5000), new Duration(1000));
test1 = test1.reduceByKeyAndWindow(new SumReducer(), new Duration(5000), new Duration(5000));
// 15s: both are built on the 5s stream in test1
test = test1.reduceByKeyAndWindow(new SumReducer(), new Duration(15000), new Duration(5000));
test1 = test1.reduceByKeyAndWindow(new SumReducer(), new Duration(15000), new Duration(15000));
// 30s: built on the 15s stream in test1
test = test1.reduceByKeyAndWindow(new SumReducer(), new Duration(30000), new Duration(15000));
test.print();
I tried that but nothing gets printed.