1

Given some sensor data in a clickhouse table with values every 30 seconds but also some data gaps, I would like to resample the data to 30 seconds and then forward fill the gaps but only if the gaps is shorter than a certain limit.

(similar to method_2 in Python here: https://stackoverflow.com/a/40442467/17080022).

Raw data:

+------------+---------+-------------------+-----+
|equipment_id|sensor_id|datetime           |value|
+------------+---------+-------------------+-----+
|123         |456      |2021-02-05 09:18:46|63   |
|123         |456      |2021-02-05 09:19:07|64   |
|123         |456      |2021-02-05 09:18:12|63   |
|123         |456      |2021-02-05 09:22:11|63   |
|123         |456      |2021-02-05 09:22:38|65   |
|123         |456      |2021-02-05 09:24:22|65   |
|123         |456      |2021-02-05 09:24:53|65   |
|123         |456      |2021-02-05 09:24:58|66   |
|123         |456      |2021-02-05 09:26:38|62   |
|123         |456      |2021-02-05 09:27:13|82   |
|123         |456      |2021-02-05 09:27:52|76   |
+------------+---------+-------------------+-----+

Following code already resamples to 30 seconds, but fills all gaps with 0:

SELECT
  equipment_id,
  sensor_id,
  datetime,
  value
FROM table_name
WHERE
  datetime > '2021-01-05 09:18:00' AND
  datetime < '2021-01-05 09:28:00' AND
  sensor_id IN (456) AND
  equipment_id = 123
GROUP BY equipment_id, sensor_id, toStartOfInterval(datetime, INTERVAL 30 second) AS datetime
ORDER BY datetime WITH FILL STEP 30

Output:

+------------+------------+-------------------+-----+
|equipment_id|sensor_id   |datetime           |value|
+------------+------------+-------------------+-----+
|123         |456         |2021-01-05 09:18:30|63   |
|123         |456         |2021-01-05 09:19:00|64   |
|0           |0           |2021-01-05 09:19:30|0    |
|0           |0           |2021-01-05 09:20:00|0    |
|0           |0           |2021-01-05 09:20:30|0    |
|0           |0           |2021-01-05 09:21:00|0    |
|0           |0           |2021-01-05 09:21:30|0    |
|123         |456         |2021-01-05 09:22:00|63   |
|123         |456         |2021-01-05 09:22:30|65   |
|0           |0           |2021-01-05 09:23:00|0    |
|0           |0           |2021-01-05 09:23:30|0    |
|123         |456         |2021-01-05 09:24:00|65   |
|123         |456         |2021-01-05 09:24:30|65.5 |
|0           |0           |2021-01-05 09:25:00|0    |
|0           |0           |2021-01-05 09:25:30|0    |
|0           |0           |2021-01-05 09:26:00|0    |
|123         |456         |2021-01-05 09:26:30|62   |
|123         |456         |2021-01-05 09:27:00|82   |
|123         |456         |2021-01-05 09:27:30|76   |
+------------+------------+-------------------+-----+

My desired output with forward filled gaps up to a length of 3 rows (or 90 seconds) would be:

+------------+------------+-------------------+--------------+--------------+
|equipment_id|sensor_id   |datetime           |original_value|desired_value |
+------------+------------+-------------------+--------------+--------------+
|123         |456         |2021-01-05 09:18:30|63            |63            |
|123         |456         |2021-01-05 09:19:00|64            |64            |
|123         |456         |2021-01-05 09:19:30|0             |null          |
|123         |456         |2021-01-05 09:20:00|0             |null          |
|123         |456         |2021-01-05 09:20:30|0             |null          |
|123         |456         |2021-01-05 09:21:00|0             |null          |
|123         |456         |2021-01-05 09:21:30|0             |null          |
|123         |456         |2021-01-05 09:22:00|63            |63            |
|123         |456         |2021-01-05 09:22:30|65            |65            |
|123         |456         |2021-01-05 09:23:00|0             |65            |
|123         |456         |2021-01-05 09:23:30|0             |65            |
|123         |456         |2021-01-05 09:24:00|65            |65            |
|123         |456         |2021-01-05 09:24:30|65.5          |65.5          |
|123         |456         |2021-01-05 09:25:00|0             |65.5          |
|123         |456         |2021-01-05 09:25:30|0             |65.5          |
|123         |456         |2021-01-05 09:26:00|0             |65.5          |
|123         |456         |2021-01-05 09:26:30|62            |62            |
|123         |456         |2021-01-05 09:27:00|82            |82            |
|123         |456         |2021-01-05 09:27:30|76            |76            |
+------------+------------+-------------------+--------------+--------------+

Can this be done in clickhouse?

AJM
  • 11
  • 3

0 Answers0