What is the appropriate way to handle data that is, by virtue of its source, rolled-up or otherwise summarised and weighted ahead of time (min, max, avg for the summarisation)?
I cannot change that behaviour, so while "Do not summarise the data" is unfortunately not going to fly.
The data would look like this:
- Data that is less than a week old: every 5 minutes,
weight = 5
- Data that is between a week and a month old: summarised into hourly data,
weight = 60
- Data that is older than that: summarised into daily data,
weight = 60
table, th, td {
border: 1px solid;
}
td {
font-family: monospace;
}
<table>
<tr>
<th>timestamp
<th>min
<th>max
<th>avg
<th>weight
<tr>
<td>2022-05-18 00:10:00
<td>1
<td>10
<td>3
<td>5
<tr>
<td>2022-05-18 00:05:00
<td>2
<td>6
<td>4
<td>5
<tr>
<td>2022-05-18 00:00:00
<td>9
<td>93
<td>44
<td>5
<tr>
<td>...
<tr>
<td>2022-05-11 02:00:00
<td>1
<td>1
<td>1
<td>60
<tr>
<td>2022-05-11 01:00:00
<td>7
<td>94
<td>23
<td>60
<tr>
<td>2022-05-11 00:00:00
<td>0
<td>0
<td>0
<td>60
<tr>
<td>...
<tr>
<td>2022-04-18 00:00:00
<td>1
<td>100
<td>50
<td>1440
<tr>
<td>2022-04-17 00:00:00
<td>10
<td>10
<td>10
<td>1440
<tr>
<td>2022-04-16 00:00:00
<td>0
<td>15
<td>10
<td>1440
Everything says "do not mix granularity" but then has a caveat of "you could use a field indicating weight". Ought one store the data for the different timeframes in a single fact table with the weights, or in multiple fact tables?
The questions that will be asked will tend to cross the boundaries of the different bands in terms of timeframe.