I'm thinking about designing an event processing system. The rules per se are not the problem. What bogs me down is how to store event data so that I can efficiently answer questions/facts like:

If the number of events of type A in the last 10 minutes equals N, and the average number of events of type B per minute over the last M hours is Z, and the current running average of another metric is Y... then fire some event (or store a new fact/event).

How do Esper/Drools/MS StreamInsight store their time-dependent data so that they can efficiently calculate event stream properties? Do they just store it in SQL databases and continuously query them?

Do they preprocess the rules so they can know beforehand what "knowledge" they need to store?

Thanks

EDIT: I found that what I want is called Event Stream Processing, and the Wikipedia example shows what I would like to do:

WHEN Person.Gender EQUALS "man" AND Person.Clothes EQUALS "tuxedo"
FOLLOWED-BY
  Person.Clothes EQUALS "gown" AND
  (Church_Bell OR Rice_Flying)
WITHIN 2 hours
ACTION Wedding

Still the question remains: how do you implement such a data store? The key is "WITHIN 2 hours" and the ability to process thousands of events per second.

marianov

2 Answers

Esper analyzes the rule and stores only derived state (aggregations etc., if any) and, if the rule needs it, a subset of events. Esper also allows defining contexts like those described in the book by Opher Etzion and Peter Niblett, which I recommend reading. By specifying a context, Esper can minimize the amount of state it retains, and the queries become easier to read.
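To make "only stores derived state" concrete, here is a minimal Python sketch of a sliding time window that keeps a running total next to the events still inside the window. It is not Esper's actual implementation; the class name `SlidingWindowStats` and its methods are hypothetical, and it assumes events arrive in non-decreasing timestamp order.

```python
from collections import deque


class SlidingWindowStats:
    """Hypothetical sketch: count and average over the last `window_seconds`.

    Assumes timestamps are numbers (seconds) arriving in non-decreasing order.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0       # derived state: sum of values in the window

    def _expire(self, now):
        # Drop events that have left the window and adjust the running total.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, old_value = self._events.popleft()
            self._total -= old_value

    def add(self, timestamp, value=1.0):
        self._expire(timestamp)
        self._events.append((timestamp, value))
        self._total += value

    def count(self, now):
        self._expire(now)
        return len(self._events)

    def average(self, now):
        self._expire(now)
        if not self._events:
            return None
        return self._total / len(self._events)
```

Each event is appended once and removed once, so the cost per event is amortized constant and nothing is re-queried. A rule such as "count of A in the last 10 minutes equals N" would then only compare `count(now)` against N.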

user650839
  • So the answer is that Esper parses all the rules beforehand, so it can know what state to retain? Do you know what type of data structure it uses to store state? – marianov Dec 16 '14 at 21:23
  • The data structure depends on what state needs to be retained. There are around 30 or so data windows, for example, and around 30 or so aggregations, and patterns and so on. Each has its own data structure. Esper documentation can be found at http://esper.codehaus.org/esper-5.1.0/doc/reference/en-US/html_single/index.html – user650839 Dec 16 '14 at 22:03

It's not difficult to store events happening within a time window of a certain length. The problem gets harder if you have to consider additional constraints: in that case you need to analyze the rules so that you can maintain sets of events matching the constraints.

Storing events in an (external) database will be too slow.
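As a rough illustration of maintaining a set of events that match a constraint in memory, here is a Python sketch of a "first FOLLOWED-BY second WITHIN t" rule. This is not how Drools does it internally; the class `FollowedByWithin` and its predicate arguments are hypothetical, and it assumes events arrive in non-decreasing timestamp order.

```python
from collections import deque


class FollowedByWithin:
    """Hypothetical sketch of 'first FOLLOWED-BY second WITHIN within_seconds'."""

    def __init__(self, is_first, is_second, within_seconds):
        self.is_first = is_first        # predicate for the opening event
        self.is_second = is_second      # predicate for the closing event
        self.within_seconds = within_seconds
        self._pending = deque()         # timestamps of unmatched first events

    def on_event(self, timestamp, event):
        # Forget first events that can no longer be completed in time.
        cutoff = timestamp - self.within_seconds
        while self._pending and self._pending[0] < cutoff:
            self._pending.popleft()

        matches = []
        if self.is_second(event):
            # Every first event still pending completes the pattern.
            matches = [(first_ts, timestamp) for first_ts in self._pending]
            self._pending.clear()
        elif self.is_first(event):
            self._pending.append(timestamp)
        return matches
```

Only the timestamps of events that satisfy the first condition are retained, and only until the window closes, so the state stays small even at thousands of events per second.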

laune
  • I think that is where Esper has an advantage over Drools, in that EPL is designed to make it explicit which events are retained, if any. I.e. if the Esper EPL does not declare a data window, Esper does not retain events (exceptions: patterns and SQL-standard match-recognize) – user650839 Dec 16 '14 at 21:00
  • Can you point to some documentation about how you specify the "data window" for each rule? – marianov Dec 16 '14 at 21:24
  • @marianov Get the Drools documentation of a 6.x release. See "window:time" and other details of event processing. – laune Dec 17 '14 at 06:24
  • A fun (to me at least) exception to the "external database will be too slow" comment is a column-based database such as KDB+ (http://kx.com/kdb-plus.php ), which is quite popular on trading floors for evaluating price changes over time. Learning q is not fun though. :) – Steve Dec 17 '14 at 09:08