I have topics in my system that store events for given entities. Now I would like to perform some analysis on the event log. For that, I need to query all events belonging to some entity within a certain time period. Is it possible to aggregate all events for a certain key within a TimeWindow using Kafka Streams?
2 Answers
Sounds like you just want the groupByKey method of the DSL.

- Okay, but how would it be possible to query the datapoints with a certain key? – jossdb Mar 15 '20 at 19:59
- Not sure what you're expecting. Have you researched the Interactive Queries feature or KSQL? – OneCricketeer Mar 15 '20 at 23:23
- Yes I know interactive queries, but the problem is that I was only able to get the "latest" value for some key. But how to query for the previous values for that key? – jossdb Mar 16 '20 at 10:33
- You would store both values in the message when you group the key, not only the "other" value that's being merged in – OneCricketeer Mar 16 '20 at 12:53
- I see. So I would use e.g. aggregate to add the oldValue to the message? – jossdb Mar 16 '20 at 13:55
It really depends on how you want to set up your system, what kind of analysis you want to do, and what exactly you mean by "querying".
For a one-time analysis, you might just want to do a stream.transform(...).to(), filter on key and timestamp (context.timestamp() is your friend) in your Transformer, and write the result into a topic. Hence, you would run this program once for some key and time range. You could perhaps even do the required analysis before writing any result, if you use a WindowStore (with duplicates retained) to buffer all data in the store.
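To make the filtering step concrete, here is a minimal plain-Java sketch of the selection a Transformer would apply per record. It deliberately leaves out the Kafka Streams API: the `Event` class, the `select` method, and the half-open range `[fromMs, toMs)` are assumptions made for illustration, and in a real Transformer the timestamp would come from context.timestamp() rather than from a field.

```java
import java.util.ArrayList;
import java.util.List;

public class KeyTimeFilter {
    /** Hypothetical event shape: key, record timestamp (epoch millis), payload. */
    public static final class Event {
        public final String key;
        public final long timestampMs;
        public final String value;

        public Event(String key, long timestampMs, String value) {
            this.key = key;
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    /** Keeps the events of one key whose timestamp lies in [fromMs, toMs). */
    public static List<Event> select(List<Event> log, String key, long fromMs, long toMs) {
        List<Event> out = new ArrayList<>();
        for (Event e : log) {
            boolean keyMatches = e.key.equals(key);
            boolean inRange = e.timestampMs >= fromMs && e.timestampMs < toMs;
            if (keyMatches && inRange) {
                out.add(e); // in Kafka Streams, this is where the record would be forwarded
            }
        }
        return out;
    }
}
```

Records that fail either check are simply dropped, which is why such a job only needs to run once per key and time range.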
If you want to write a program that prepares *all* data for analysis, you would use groupBy() (or groupByKey()). Using windowedBy() with TimeWindows only works if you know upfront the time ranges by which you want to group the data (e.g., hourly, daily, or similar). For the aggregation itself, you could return a List<Value> object and accumulate the corresponding records per key and window. This way, you can use IQ to get all records for a specific key and window with a single lookup.
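The per-key, per-window accumulation can be sketched in plain Java, without the Kafka Streams API, to show what the aggregate ends up holding. The class `WindowedEventLog` and its methods are hypothetical names used only for this illustration. The sketch assumes fixed-size, non-overlapping windows whose start is the timestamp rounded down to a multiple of the window size, which is how tumbling windows aligned to the epoch behave.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WindowedEventLog {
    private final long windowSizeMs;
    // key -> (window start in epoch millis -> values in arrival order)
    private final Map<String, Map<Long, List<String>>> store = new HashMap<>();

    public WindowedEventLog(long windowSizeMs) {
        if (windowSizeMs <= 0) {
            throw new IllegalArgumentException("windowSizeMs must be positive");
        }
        this.windowSizeMs = windowSizeMs;
    }

    /** Rounds a timestamp down to the start of its fixed-size window. */
    public static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - Math.floorMod(timestampMs, windowSizeMs);
    }

    /** Appends a value to the list kept for (key, window of timestampMs). */
    public void add(String key, long timestampMs, String value) {
        long start = windowStart(timestampMs, windowSizeMs);
        store.computeIfAbsent(key, k -> new HashMap<>())
             .computeIfAbsent(start, s -> new ArrayList<>())
             .add(value);
    }

    /** Single lookup: all values recorded for a key in the window starting at windowStartMs. */
    public List<String> get(String key, long windowStartMs) {
        Map<Long, List<String>> windows = store.get(key);
        if (windows == null) {
            return Collections.emptyList();
        }
        return windows.getOrDefault(windowStartMs, Collections.emptyList());
    }
}
```

With hourly windows, for example, `new WindowedEventLog(3_600_000L)` keeps one list per key and hour, so `get("a", 0L)` returns every value recorded for key "a" during the first hour. The trade-off is the one mentioned above: the window size is fixed when the data is grouped, so a query for an arbitrary time range has to read every window it overlaps.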
