I'm looking for an efficient algorithm for matching patterns / sequences in a [large] list of data. Given some type:
class Data implements Event {
    int valueA;
    int valueB;
    int valueC;
    long timestamp;
    ...
}
I would like to match situations such as the following.
valueA == 1 for 10 seconds
then valueB == 2 for 10 seconds
then valueC == 3 for 10 seconds.
I've implemented a very basic state machine to match this kind of pattern, which works very well and has very acceptable throughput. However, suppose I want to add additional time constraints, e.g. the second pattern must occur X seconds after the first:
a : valueA == 1 for 10 seconds
b : valueB == 2 for 10 seconds [ 10 seconds after a ]
c : valueC == 3 for 10 seconds [ 10 seconds after b ]
The state machine concept no longer seems appropriate, as it would be necessary to evaluate (and re-evaluate) already matched conditions, store the intermediate states in memory, and then attempt to correlate them. There will be around 1000 "rules" of this type in the system.
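For context, here is a minimal sketch of what such a basic state machine could look like for the first pattern (without the extra "X seconds after" constraints). It is illustrative only, not the actual implementation: `HoldSequenceMatcher` is a hypothetical name, `Data` is a simplified stand-in for the type above, timestamps are assumed to be in milliseconds, and a broken hold is assumed to restart the whole sequence.

```java
import java.util.List;
import java.util.function.Predicate;

// Simplified stand-in for the Data type in the question (assumption: timestamp in millis).
class Data {
    final int valueA, valueB, valueC;
    final long timestamp;

    Data(int valueA, int valueB, int valueC, long timestamp) {
        this.valueA = valueA;
        this.valueB = valueB;
        this.valueC = valueC;
        this.timestamp = timestamp;
    }
}

// Hypothetical matcher: each step is a condition that must hold continuously for holdMillis.
class HoldSequenceMatcher {
    private final List<Predicate<Data>> steps;
    private final long holdMillis;
    private int current = 0;      // index of the step currently being matched
    private long heldSince = -1;  // when the current condition first became true, -1 if not holding

    HoldSequenceMatcher(List<Predicate<Data>> steps, long holdMillis) {
        this.steps = steps;
        this.holdMillis = holdMillis;
    }

    /** Feeds one event; returns true when the final step completes on this event. */
    boolean onEvent(Data d) {
        if (!steps.get(current).test(d)) {
            if (heldSince >= 0) {
                // Assumption: a broken hold restarts the whole sequence from step 0.
                current = 0;
                heldSince = steps.get(0).test(d) ? d.timestamp : -1;
            }
            return false; // otherwise keep waiting for the current step to begin
        }
        if (heldSince < 0) {
            heldSince = d.timestamp;
        }
        if (d.timestamp - heldSince < holdMillis) {
            return false; // condition holds, but not yet for long enough
        }
        current++;        // step satisfied, move on to the next one
        heldSince = -1;
        if (current < steps.size()) {
            return false;
        }
        current = 0;      // full sequence matched, reset for the next occurrence
        return true;
    }
}
```

The matcher keeps only two fields of state per rule (`current` and `heldSince`), which is why this style is cheap; the time-constrained variant is what seems to break that property.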
** EDIT **
To clarify, if I were trying to match a sequence such as the following:
x changed to 1
1 second passed
y changed to 1
3 seconds passed
z changed to 1
And given the input data:
[ x=0, y=0, z=0, t=0 sec ]
[ x=1, y=0, z=0, t=1 sec ]
[ x=0, y=1, z=0, t=2 sec ]
[ x=1, y=0, z=0, t=3 sec ]
[ x=0, y=1, z=0, t=4 sec ]
[ x=1, y=0, z=0, t=5 sec ]
[ x=0, y=1, z=0, t=6 sec ]
[ x=0, y=0, z=1, t=7 sec ]
I would expect this to reach its final state at t=7. But the only way I can see to do this is to store all of the other state transitions?
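To make the concern concrete, here is a rough sketch of the naive approach: keep one partial match ("run") per candidate start and advance each independently. Everything here is an assumption for illustration (`NaiveSequenceMatcher` and `Run` are hypothetical names, events are `{x, y, z}` arrays, timestamps are whole seconds, and the gaps of 1 and 3 seconds must be met exactly).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical naive matcher for: x -> 1, 1 second later y -> 1, 3 seconds later z -> 1.
class NaiveSequenceMatcher {
    /** One partial match: the next step to look for and when the previous step matched. */
    private static final class Run {
        final int nextStep;
        final long lastTime;

        Run(int nextStep, long lastTime) {
            this.nextStep = nextStep;
            this.lastTime = lastTime;
        }
    }

    // GAPS[i] = seconds required between step i and step i + 1 (assumed exact).
    private static final long[] GAPS = {1, 3};
    private static final int LAST_STEP = 2;

    private List<Run> runs = new ArrayList<>();
    private int[] prev; // previous sample, used to detect "changed to 1"

    /** Feeds one sample {x, y, z} at time t (seconds); true if a run completes here. */
    boolean onEvent(int[] values, long t) {
        boolean matched = false;
        List<Run> next = new ArrayList<>();
        for (Run r : runs) {
            long due = r.lastTime + GAPS[r.nextStep - 1];
            if (t < due) {
                next.add(r); // still waiting for its deadline
            } else if (t == due && changedTo1(values, r.nextStep)) {
                if (r.nextStep == LAST_STEP) {
                    matched = true;
                } else {
                    next.add(new Run(r.nextStep + 1, t));
                }
            }
            // otherwise the run missed its deadline and is dropped
        }
        if (changedTo1(values, 0)) {
            next.add(new Run(1, t)); // every x -> 1 transition starts a new candidate
        }
        runs = next;
        prev = values.clone();
        return matched;
    }

    /** Number of partial matches currently held in memory. */
    int pendingRuns() {
        return runs.size();
    }

    private boolean changedTo1(int[] values, int i) {
        return prev != null && prev[i] != 1 && values[i] == 1;
    }
}
```

Fed the input above, this completes at t=7 via the run that started at t=3 (x at 3, y at 4, z at 7), but only because the earlier and later candidates are tracked alongside it. Multiplied by around 1000 rules, that per-rule list of runs is the memory and re-evaluation cost I am trying to avoid.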
** END EDIT **
I've previously used a rule engine with CEP support to match this kind of condition, which works reasonably well but isn't able to handle the large amounts of data required (several hundred thousand events per second).
Is there an efficient way to solve this problem? I'm using Java.
Thanks