I've been reading up on architectures for OLAP processing, specifically the star schema concept. I'm currently setting up a database for testing.
My Situation
I have 750 sensors, each of which will post data to an SQL database every minute. Currently, my approach is to post only when the value changes. Thinking about this a little further, I wonder whether that makes it impossible to tell data loss apart from an unchanged value, which makes me reconsider posting on a fixed schedule.
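For example, if every sensor posted on a fixed schedule, a lost reading would show up as a gap between consecutive timestamps. Here is a minimal sketch of such a gap check in PostgreSQL-flavored SQL; the table and column names (sensor_readings, reading_ts, and so on) are placeholders I'm assuming, not an existing schema:

```sql
-- Find sensors whose consecutive readings are more than one minute apart,
-- which under scheduled once-per-minute posting implies lost data.
SELECT sensor_name,
       prev_ts,
       reading_ts
FROM (
    SELECT sensor_name,
           reading_ts,
           LAG(reading_ts) OVER (PARTITION BY sensor_name
                                 ORDER BY reading_ts) AS prev_ts
    FROM sensor_readings          -- hypothetical raw readings table
) gaps
WHERE reading_ts - prev_ts > INTERVAL '1 minute';
```

Under change-only posting, the same gap could just as easily mean the value never changed, which is exactly the ambiguity I'm worried about.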
The software that inserts each sensor's value posts a timestamp in the format 2020-01-23 13:48:52. Along with this, the sensor name and sensor value are also posted, so that's four columns in my fact table if you count the primary key.
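For concreteness, here is how I picture that table in PostgreSQL syntax; the names are placeholders of mine, not a fixed design:

```sql
-- Raw landing table, currently doubling as the fact table:
-- primary key, sensor name, sensor value, timestamp.
CREATE TABLE sensor_readings (
    reading_id   BIGSERIAL   PRIMARY KEY,  -- surrogate key
    sensor_name  VARCHAR(64) NOT NULL,
    sensor_value NUMERIC     NOT NULL,
    reading_ts   TIMESTAMP   NOT NULL      -- e.g. 2020-01-23 13:48:52
);
```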
I know I'll need to do some intermediate processing on my data (probably a staging table that is processed on a schedule to make the data conform to the schema, then dropped), but for now I'm trying to keep this high-level and tackle one issue at a time; a rough sketch of that step follows.
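One way that scheduled step could look, assuming star tables like the ones sketched after the next paragraph (and truncating the staging table rather than dropping it, which amounts to the same reset):

```sql
-- Scheduled conformance step: look up dimension keys for each raw row,
-- load the conformed rows into the fact table, then reset the staging table.
BEGIN;

INSERT INTO sensor_fact (sensor_key, time_key, sensor_value)
SELECT d.sensor_key,
       t.time_key,
       r.sensor_value
FROM   sensor_readings r
JOIN   sensor_dim d ON d.sensor_name = r.sensor_name
JOIN   time_dim   t ON t.full_minute = date_trunc('minute', r.reading_ts);

TRUNCATE sensor_readings;

COMMIT;
```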
Here is roughly what I have envisioned so far, but I'm lost on how to handle the time dimension considering the one-minute resolution I need.
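In place of a diagram, a sketch of that layout in the same placeholder naming, with the time dimension deliberately left as a stub, since its grain is exactly what I'm unsure about:

```sql
-- One dimension row per sensor.
CREATE TABLE sensor_dim (
    sensor_key  SERIAL      PRIMARY KEY,
    sensor_name VARCHAR(64) NOT NULL UNIQUE
);

-- Time dimension stub: which grain and attributes belong here
-- at one-minute resolution is my open question below.
CREATE TABLE time_dim (
    time_key    SERIAL    PRIMARY KEY,
    full_minute TIMESTAMP NOT NULL UNIQUE
);

-- Fact table after conformance: dimension keys plus the measure.
CREATE TABLE sensor_fact (
    sensor_key   INT     NOT NULL REFERENCES sensor_dim (sensor_key),
    time_key     INT     NOT NULL REFERENCES time_dim (time_key),
    sensor_value NUMERIC NOT NULL
);
```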
My Questions
"Currently, my approach is to post if there is a change in the value. Thinking about this a little further makes me wonder if it could pose an issue in determining if there is data loss vs an unchanged value and makes me re-consider scheduled posting."
Is this a valid concern to identify data loss or is there another way to address it?
Considering the one-minute resolution I need, what would a sample time dimension look like?