
I'm asking for help/tips with system design.

I have an IoT system with sensors: PIR (motion), contactrons, temperature & humidity ... Nothing fancy.

I'm collecting and filtering the raw data to build some observations on top. So far I have some event_rules classes that are bound to sensors and return True/False depending on the data that is constantly coming in from the queue (from the sensors).

I know I need to run some periodic analyses on existing data, e.g. when motion sensors are not reporting anymore, or on both incoming and existing data. That includes loading the data and analyzing it in some time window (counting, averaging, etc.). That time window approach could help answer questions like:

temperature increased by over 10 deg in the last 1 h, or no motion detected for the past 10 min, or high/low/no movement detected over the last 30 min

My naive approach was to run a semi-cron Python thread that executes the rules one by one and checks their output every N seconds, e.g. every 30 s. Some rules include a state machine and handle transitions from one state to another. But this is bad IMHO: imagine the system scales up and all of a sudden it has to check hundreds of rules every N seconds.

I know some generic approach is needed. How shall I tackle this? What is the correct approach? In the uC world I'd call it: how to properly generate a system clock that checks the rules, but not all at once and in a kind of configurable manner.

I'd be thankful for tips; maybe there are already some Python libraries that address this. I'm using pandas for the analyses and machine state for the state transitions. Event rules are defined in an SQL database and cast to a polymorphic Python class based on the rule type.
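
To make the requirement concrete, here is a minimal sketch of the kind of triggering described above: every rule carries its own interval, and a priority queue keeps the next due time so that only the rules that are actually due get checked. All names (`RuleScheduler`, `add_rule`, `run_due`) are hypothetical and only for illustration; the rule callables stand in for the event_rules classes.

```python
import heapq
import itertools


class RuleScheduler:
    """Hypothetical sketch: evaluates each rule at its own interval."""

    def __init__(self):
        self._heap = []                    # entries: (due, seq, interval_s, check)
        self._seq = itertools.count()      # tie-breaker so callables are never compared

    def add_rule(self, check, interval_s, first_due=0.0):
        """Register a rule callable that should run every interval_s seconds."""
        heapq.heappush(self._heap, (first_due, next(self._seq), interval_s, check))

    def next_due(self):
        """Time at which the next rule becomes due, or None if there are no rules."""
        return self._heap[0][0] if self._heap else None

    def run_due(self, now):
        """Run only the rules whose due time has passed and return their results."""
        results = []
        while self._heap and self._heap[0][0] <= now:
            _, seq, interval_s, check = heapq.heappop(self._heap)
            results.append(check())
            # Reschedule relative to 'now' so a late call does not trigger a burst.
            heapq.heappush(self._heap, (now + interval_s, seq, interval_s, check))
        return results


calls = []
scheduler = RuleScheduler()
scheduler.add_rule(lambda: calls.append("fast") or True, interval_s=10)
scheduler.add_rule(lambda: calls.append("slow") or False, interval_s=60)
```

In a real loop, `now` would presumably come from `time.monotonic()`, and the thread could sleep until `next_due()` instead of waking up every N seconds.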

morf
  • Did I understand you correctly: you want to define a time window with a fixed duration and apply that time window to your data to calculate aggregations for the included values? – squeezer44 Sep 17 '20 at 09:35
  • Partly. The question is rather about how to trigger a number of functions (that check rules) periodically and in a controllable manner (the analyses are run inside those functions). Imagine a state machine that will "check" some conditions at predefined intervals and decide if a transition shall be done. – morf Sep 17 '20 at 09:41

1 Answer


Using a pandas rolling window could be a solution (sources: pandas.pydata.org: Window, How to use rolling in pandas?).

In general this means:
Step 1:
Define a window based either on a number of rows (increasing index id) or on time (increasing timestamp)

Step 2:
Apply this window to the dataset

Principle of a rolling window

The code snippet below applies basic calculations (mean, min, max) to a dataframe and adds the results as new columns in the dataframe.
To keep the original dataframe clean, I suggest working on a copy:

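import pandas as pd

# Load the data; replace the placeholder with the real path
df = pd.read_csv('[PathToDatasource]')
# Work on a copy so the original dataframe stays untouched
df_copy = df.copy()

# Rolling statistics over a window of 10 rows, each stored in its own column
df_copy['moving-average'] = df_copy['SourceColumn'].rolling(window=10).mean()
df_copy['moving-min'] = df_copy['SourceColumn'].rolling(window=10).min()
df_copy['moving-max'] = df_copy['SourceColumn'].rolling(window=10).max()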
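
The snippet above uses a window of 10 rows. For the time-based variant from Step 1, a minimal sketch could look like the following, assuming the data has a DatetimeIndex. The readings are made up for illustration, and "rise" is interpreted here as the current value minus the lowest value seen in the last 60 minutes, which is only one possible reading of "temperature increased over 10deg in last 1h".

```python
import pandas as pd

# Made-up temperature readings, one every 20 minutes (illustration only)
index = pd.date_range("2020-09-17 09:00", periods=6, freq="20min")
temps = pd.Series([20.0, 21.0, 23.0, 27.0, 34.0, 35.0], index=index)

# Time-based window: each row looks back over the preceding 60 minutes
lowest_in_last_hour = temps.rolling("60min").min()

# Rise of the current reading over the lowest reading inside the window
rise = temps - lowest_in_last_hour

# True where the temperature rose by more than 10 degrees within the window
alert = rise > 10
print(alert)
```

With an offset string such as "60min" the window size follows the timestamps rather than the row count, so irregularly spaced sensor readings are handled as well.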
squeezer44
  • Thank you, my question was rather how to trigger a number of functions that do the calculations with a rolling window. In other words, I'm looking for some solution that will help control task triggering – morf Sep 17 '20 at 11:30
  • I just added a sample to my answer, so it should be clearer what I meant. Does that answer your question? – squeezer44 Sep 17 '20 at 13:39