I'm new to CEP 2.1, and my question is about how long CEP holds on to the data sent to an input stream.

Say you regularly send data to some input stream, for example "HELLOSTREAM". How long does CEP keep the inputs to that stream? What is the maximum time, etc.?

Say I send data every day for 365 days. Will I get back all the data on day 366, or will it truncate the data at some point (for example, hold only the last 100 days), no matter what time window I set in the query?

Is there a limit?

2 Answers

CEP is a real-time processing server. It is used to detect predefined patterns in real time and for real-time monitoring. It keeps data in memory while it processes events, but you can persist the data to Cassandra for distributed processing.

Data is kept in memory according to the window you define. How much is retained depends on the window type you are using and the time or length given to that window. If you are not using a window, no data is kept in memory.
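To illustrate the point about windows, here is a minimal Python sketch of the idea. It is not CEP or Siddhi code; the `TimeWindow` and `LengthWindow` classes are hypothetical names made up for this example, and a real engine handles expiry differently. It only shows that what stays in memory is bounded by the window you define, not by a separate hidden limit.

```python
from collections import deque


class TimeWindow:
    """Hypothetical sliding time window: keeps events from the last `span` seconds."""

    def __init__(self, span):
        self.span = span
        self.events = deque()  # (timestamp, payload) pairs, oldest first

    def add(self, timestamp, payload):
        self.events.append((timestamp, payload))
        # Expire everything that is now older than the window span.
        while self.events and self.events[0][0] <= timestamp - self.span:
            self.events.popleft()

    def contents(self):
        return [payload for _, payload in self.events]


class LengthWindow:
    """Hypothetical length window: keeps only the most recent `size` events."""

    def __init__(self, size):
        self.events = deque(maxlen=size)  # deque drops the oldest item itself

    def add(self, payload):
        self.events.append(payload)

    def contents(self):
        return list(self.events)


DAY = 24 * 60 * 60  # seconds

# One event per day for 366 days into a 100-day time window.
time_window = TimeWindow(span=100 * DAY)
for day in range(1, 367):
    time_window.add(day * DAY, f"event-{day}")

# Five events into a length window of size 3.
length_window = LengthWindow(size=3)
for n in range(1, 6):
    length_window.add(n)
```

In this sketch, a 100-day window fed one event per day ends up holding 100 events on day 366; a 365-day window would hold 365. With no window at all, nothing would be retained.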

If you want to store data for 365 days or 100 days, that is not a real-time use case. For that, you should use an offline processing server such as BAM.

Mohanadarshan
  • Hi, thanks for the answer. We were thinking of a login scenario: you have data on a user's previous logins over, say, the last 100 days, and when a new login request comes in you check whether it is from the same IP, etc. Do you think CEP is not suitable for that? (Say it sends a warning to someone that the account may have been hacked.) This is a very basic example; the same could apply to credit cards, where they were used, etc. – user2656744 Aug 07 '13 at 05:43
  • I think Rajeev's answer gives the solution for your use case. This will be supported through event tables from CEP 3.0.0. – Mohanadarshan Aug 07 '13 at 06:27

To add to @Mohanadarshan's answer: if what you want is to extract and store some values from a fast event stream over a long period, a better CEP-based approach is to use persistent event tables (which will be included in the upcoming CEP 3.0.0 release, due soon). This way you'll be able to do real-time processing against extracted and persisted data. But as @Mohanadarshan mentioned, if your requirement is batch processing (and you do not need to detect anything in real time), WSO2 BAM will be a better option.
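The idea of checking live events against a small persisted table, rather than holding months of raw events in a window, can be sketched roughly as follows. This is plain Python with SQLite standing in for a persistent event table. It is not the CEP 3.0.0 event table API, and the table name `known_logins` and the functions `open_table` and `on_login` are assumptions made up for the login example from the comments. The sketch also never expires old rows, so a "last 100 days" rule would need an extra timestamp column.

```python
import sqlite3


def open_table(path=":memory:"):
    """Open the (hypothetical) persisted table of IPs already seen per user."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS known_logins ("
        "username TEXT NOT NULL, ip TEXT NOT NULL, "
        "PRIMARY KEY (username, ip))"
    )
    return conn


def on_login(conn, username, ip):
    """Handle one login event; return True if it should raise a warning.

    A login is flagged when the user already has stored logins but none
    from this IP. The IP is then stored so later logins can match it.
    """
    seen_user = conn.execute(
        "SELECT 1 FROM known_logins WHERE username = ? LIMIT 1", (username,)
    ).fetchone()
    seen_ip = conn.execute(
        "SELECT 1 FROM known_logins WHERE username = ? AND ip = ?",
        (username, ip),
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO known_logins (username, ip) VALUES (?, ?)",
        (username, ip),
    )
    conn.commit()
    return seen_user is not None and seen_ip is None


conn = open_table()
first = on_login(conn, "alice", "10.0.0.1")    # first ever login, nothing to compare
repeat = on_login(conn, "alice", "10.0.0.1")   # same IP as before
new_ip = on_login(conn, "alice", "203.0.113.7")  # known user, unseen IP
```

Passing a file path instead of `":memory:"` would keep the table across restarts, which is the property the event table approach is meant to provide.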

Using sliding windows over a very long period to store large amounts of data is not a good idea: the events are stored in memory, and you'll lose the data if the server goes down.

Rajeev Sampath
  • Regarding your question on whether the window truncates events: the answer is no. The time window will hold all the data for the given period. If you are using a time window over a long period and want to ensure data is preserved across server crashes, one option is to enable snapshots. This persists the window's state from time to time, so that all events accumulated up to the last snapshot can be retained. – Rajeev Sampath Aug 07 '13 at 07:08