I have a rather common time-series set up:
- A TimescaleDB Hypertable (data) tracking some metrics over time. The columns are: timestamp, identifier, metric.
- The Hypertable has a continuous aggregation table (aggregated_data) that aggregates the metric on an hourly basis by storing several statistical values (avg, min, max).
- Only the aggregated metrics will be shown to the user
I would like to introduce the possibility to manually consolidate the metrics (e.g. edit outliers, fill gaps, ...). These consolidated values will then be shown to the user together with the measured metrics.
What is the TimescaleDB way to achieve this?
Here's what I came up with:
- Create a regular table (consolidated_data) for the manually consolidated data
- This table will have the same columns as the aggregated_data view
- I would then UNION these two tables and add some SQL to avoid duplicates, identify the consolidated data, ...
The UNION operation would then lokk similar to this (I did not actually run/test this statement):
SELECT DISTINCT timestamp, identifier FROM (
SELECT timestamp, identifier, metric, true as consolidated FROM consolidated_data WHERE timestamp > NOW() - interval '7 days'
UNION ALL
SELECT timestamp, identifier, metric, false as consolidated FROM aggregated_data WHERE timestamp > NOW() - interval '7 days'
)
ORDER BY timestamp;
I'd expect this approach to work, however there's still a lot of unknowns:
- Should the consolidated table be a Hypertable? Would this speed up the UNION operation?
- Should I prefer MERGE?
- Is there a smarter way with tools and utilities I do not yet know of?