0

I am new to TimeScaleDB, I have created a continuous aggregate view as

CREATE MATERIALIZED VIEW minute_data 
WITH (timescaledb.continuous)
AS
SELECT
   time_bucket('1 min', time_stamp) as bucket,
   thing_key,
   avg(pulse_l) as avg_pulse_l,
   avg(pulse_h) as avg_pulse_h,
   max(pulse_l) as max_pulse_l,
   max(pulse_h) as max_pulse_h,
   min(pulse_l) as min_pulse_l,
   min(pulse_h) as min_pulse_h,
   count(thing_key) as counts,
   sum(pulse_l) as sum_pulse_l,
   sum(pulse_h) as sum_pulse_h
FROM
 water_meter
GROUP BY thing_key, bucket
WITH NO DATA;

I have created refresh a policy for this view as:

SELECT add_continuous_aggregate_policy('minute_data',
  start_offset => INTERVAL '1 day',
  end_offset => INTERVAL '1 hour',
  schedule_interval => INTERVAL '1 minute');

Even after setting the end offset to 1 hour I am getting the latest added data in the aggregate.

Refresh LOGS:

 2021-07-14 08:09:39.597 UTC [1268] LOG:  refreshing continuous aggregate "minute_data" in window [ 2021-07-13 08:10:00+00, 2021-07-14 07:09:00+00 ]

But the materialized view contains the data outside the end offset, Although start offset setting is working perfectly

below is the data in materialized view: Materialized view

EDIT: Also I am getting the latest data even before the policy refreshes.

Swapnil
  • 15
  • 4
  • How do you query the continuous aggregate for data? As you didn't disable your real time aggregation, your query might read the latest data directly from the hypertable and join with materialized data. – k_rus Jul 17 '21 at 19:13

2 Answers2

0

The policy above will run every minute (schedule_interval). When it runs, it will materialize data from between 1 day (start_offset) and 1 hour (end_offset) of the time it executes, according to the query which defined the continuous aggregate.

Even after setting the end offset to 1 hour I am getting the latest added data in the aggregate.

Refreshing the information every minute will make the latest information available in the view every one minute. I see the log time is 08:09:39 and your latest data is from 08:08:00. What is inconsistent there?

Would you mind sharing a bit more details of your expectations here?

jonatasdp
  • 1,072
  • 6
  • 8
  • Thanks for replying. But `08:09:39` is the time at which the policy has refreshed for the time window- `2021-07-13 08:10:00+00, 2021-07-14 07:09:00+00` so the latest data for time- `08:08:00` should not get materialized as it is outside this window. – Swapnil Jul 15 '21 at 05:21
0

Now I understand what you mean. And, this is the expected behavior of the query as it live-aggregates all the results. You can watch the results even if the bucket is still open.

Keep in mind that you can do some simple comparison with your actual time to know if the candle is open or not.

Let's try to explore it:

tsdb=> select now();
┌───────────────────────────────┐
│              now              │
├───────────────────────────────┤
│ 2021-07-16 12:50:48.471078+00 │
└───────────────────────────────┘
(1 row)

tsdb=> select time_bucket('1 min', now());
┌────────────────────────┐
│      time_bucket       │
├────────────────────────┤
│ 2021-07-16 12:51:00+00 │
└────────────────────────┘
(1 row)
                                                         ^
tsdb=> select time_bucket('1 min', now()) < now() - interval '1 min' as closed_bucket;
┌───────────────┐
│ closed_bucket │
├───────────────┤
│ f             │
└───────────────┘
(1 row)

Probably you can use some extra column with a similar expression to filter the only closed_bucket.

jonatasdp
  • 1,072
  • 6
  • 8
  • I see maybe this post can be useful to get it: https://blog.timescale.com/blog/achieving-the-best-of-both-worlds-ensuring-up-to-date-results-with-real-time-aggregation/ – jonatasdp Jul 16 '21 at 12:58