
What is the best approach to performing a linear regression using CEP? We have tried two different options.

We want the algorithm to run in real time.

Basic code for both approaches:

      create context IntervalSpanning3Seconds start @now end after 30 sec;

      create schema measure (
          temperature float,
          water float,
          _hours float,
          persons float,
          production float
      );

      @Name("gattering_measures")
      insert into measure
      select
          cast(getNumber(m, "measurement.bsk_mymeasurement.temperature.value"), float) as temperature,
          cast(getNumber(m, "measurement.bsk_mymeasurement.water.value"), float) as water,
          cast(getNumber(m, "measurement.bsk_mymeasurement._hours.value"), float) as _hours,
          cast(getNumber(m, "measurement.bsk_mymeasurement.persons.value"), float) as persons,
          cast(getNumber(m, "measurement.bsk_mymeasurement.production.value"), float) as production
      from MeasurementCreated m 
      where m.measurement.type = "bsk_mymeasurement";

1. Using the function stat:linest

      @Name("get_data")
      context IntervalSpanning3Seconds
      select * from measure.stat:linest(water,production,_hours,persons,temperature)
      output snapshot when terminated;

EDIT: The problem here is that "get_data" seems to be executed for each measurement rather than for the entire collection of measurements.

2. Collecting the data and passing it to a JavaScript function.

      create expression String exeReg(data) [
          var f = f(data)

          function f(d){
             .....
             // return the linear regression as a string 
          }
          return f
      ];

      @Name("get_data")
      insert into CreateEvent
      select 
         "bsk_outcome_linear_regression" as type,
         exeReg(m) as text,
         ....
      from measure m;

EDIT: Here, I would like to know the type of the variable that is passed to the exeReg() function and how I should iterate over it. An example would be nice.

I'd appreciate any help.

Jorge
  • Both approaches would work. The "linest" seems much simpler and is probably faster. What is the question? – user650839 Jan 29 '18 at 11:51
  • @user650839 please see my edit. – Jorge Jan 29 '18 at 13:29
  • The script receives the current instance of "measure" and there is no iterate available. For getting some list of "measure" events there needs to be a data window and "window(*)". – user650839 Jan 30 '18 at 19:40

1 Answer


Using JavaScript would mean that the script recomputes the result from scratch for each collection it receives. Instead of recomputing, the #linest data window is a good choice. Alternatively, you can add a custom aggregation function or a custom data window to the engine if there is specific code you want to use. If a script is desired, below is how it can receive multiple events.

    create expression String exeReg(data) [
        .... script here...
    ];

    select exeReg(window(*)) ....
    from measure#time(10 seconds);
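
To illustrate what such a script body might do once it receives the windowed events, here is a minimal sketch of a simple least-squares fit in plain JavaScript. It rests on several assumptions that are not confirmed by the question: `linearRegression` is a hypothetical helper name, `events` is assumed to be array-like with a `length` and index access, and each event is assumed to expose numeric `water` (x) and `production` (y) properties. It fits one predictor only.

```javascript
// Hypothetical helper (not part of the engine): ordinary least squares
// for production = slope * water + intercept.
// ASSUMPTION: `events` is array-like and every element exposes numeric
// `water` and `production` properties.
function linearRegression(events) {
  var n = events.length;
  if (n < 2) {
    return "not enough data";
  }

  // Accumulate the sums needed by the closed-form least-squares solution.
  var sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
  for (var i = 0; i < n; i++) {
    var x = Number(events[i].water);
    var y = Number(events[i].production);
    sumX += x;
    sumY += y;
    sumXY += x * y;
    sumXX += x * x;
  }

  // A zero denominator means all x values are identical, so no slope exists.
  var denom = n * sumXX - sumX * sumX;
  if (denom === 0) {
    return "undefined slope";
  }

  var slope = (n * sumXY - sumX * sumY) / denom;
  var intercept = (sumY - slope * sumX) / n;

  // Return the result as a string, as the question's exeReg is declared to do.
  return "slope=" + slope + ";intercept=" + intercept;
}
```

Inside the expression declared above, the script would presumably call this helper on `data` as its final expression. Depending on how the engine represents the events, field access may need a different form, for example `events[i].get("water")` for map-backed events; that detail is an assumption to verify against the engine's documentation.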
user650839