0

I have this materialized view:

                                   Materialized view "public.kwh_tag"
   Column    |       Type       | Collation | Nullable | Default | Storage | Stats target | Description
-------------+------------------+-----------+----------+---------+---------+--------------+-------------
 timest      | date             |           |          |         | plain   |              |
 Zählerstand | double precision |           |          |         | plain   |              |
 kWh         | double precision |           |          |         | plain   |              |
Indexes:
    "kwh_tag_timest_idx" UNIQUE, btree (timest)
View definition:
 SELECT s.timest::date AS timest,
    s.kwh_180 AS "Zählerstand",
    s.kwh_180 - lag(s.kwh_180, 1) OVER (ORDER BY s.timest) AS "kWh"
   FROM ( SELECT DISTINCT ON ((smartmeter.timest::date)) smartmeter.timest,
            smartmeter.kwh_180
           FROM smartmeter
          ORDER BY (smartmeter.timest::date) DESC, smartmeter.timest DESC) s;

Output:

 2023-06-10 |  35965.0285 | 22.8133000000016
 2023-06-11 |   35985.919 | 20.8905000000013
 2023-06-12 |  36012.7307 | 26.8116999999984
 2023-06-13 |  36030.2164 | 17.4856999999975

The reason is, that this query takes a lot of time, as the table smartmeter stores data every second (or 2 if the device takes a bit longer).

REFRESH MATERIALIZED VIEW
Time: 94290.866 ms (01:34.291)

However, with each day the refresh of the view also takes more time, is there some better way to handle this?

I tried using CONCURRENTLY but this takes even longer, although only a little bit:

REFRESH MATERIALIZED VIEW CONCURRENTLY kwh_tag;
REFRESH MATERIALIZED VIEW
Time: 111461.192 ms (01:51.461)

Ideally only the most recent day needs to be computed, as the old values are not going to change. I could do that outside of postgres and save it back in, a in-postgres solution would be nicer however.

fsp
  • 515
  • 6
  • 21
  • "Ideally only the most recent day needs to be computed, as the old values are not going to change." Rename your current view as `public.kwh_tag_hist` and then create a new one for the day `public.kwh_tag`, after that do a query or create another view querying both views like `select * from public.kwh_tag_hist union all select * from public.kwh_tag;`and you will have your entire result. Repeat or automate this process daily. So you will be updating just the day info and reducing your processing time. – Pepe N O Jun 13 '23 at 17:51

2 Answers2

1

One alternative for this scenario is use timescaledb extension with the continuous aggregates feature.

You can create a continuous aggregation policy that will run only in the fresh data as it uses a watermark to know what was the latest run.

jonatasdp
  • 1,072
  • 6
  • 8
0

Instead of a materialized view, you could create a partitioned table. Each partition is for the values in a particular time range. Every day, you would use TRUNCATE and INSERT INTO ... SELECT ... to rebuild the latest partition.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263