
I have a large database (currently ~900GB and still growing) partitioned by year_month and subpartitioned by currency. The problem is that aggregating over a whole partition is slow. This is a report so it will be queried very often. The partition I want to aggregate currently holds 7,829,230 rows, and each subpartition will be of similar size. Table schema (anonymized):

CREATE TABLE aggregates_dates
(
    id                    CHAR(1)         DEFAULT ''                 NOT NULL,
    date                  TIMESTAMP(0)                               NOT NULL,
    currency              CHAR(3)                                    NOT NULL,
    field01               INTEGER                                    NOT NULL,
    field02               INTEGER                                    NOT NULL,
    field03               INTEGER                                    NOT NULL,
    field04               INTEGER                                    NOT NULL,
    field05               INTEGER                                    NOT NULL,
    field06               CHAR(2)                                    NOT NULL,
    field07               INTEGER         DEFAULT 0                  NOT NULL,
    field08               INTEGER         DEFAULT 0                  NOT NULL,
    field09               INTEGER         DEFAULT 0                  NOT NULL,
    field10               INTEGER         DEFAULT 0                  NOT NULL,
    field11               INTEGER         DEFAULT 0                  NOT NULL,
    value01               INTEGER         DEFAULT 0                  NOT NULL,
    value02               INTEGER         DEFAULT 0                  NOT NULL,
    value03               INTEGER         DEFAULT 0                  NOT NULL,
    value04               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,
    value05               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,
    value06               INTEGER         DEFAULT 0                  NOT NULL,
    value07               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,
    value08               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,
    value09               INTEGER         DEFAULT 0                  NOT NULL,
    value10               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,
    value11               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,
    value12               INTEGER         DEFAULT 0                  NOT NULL,
    value13               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,
    value14               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,
    value15               INTEGER         DEFAULT 0                  NOT NULL,
    value16               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,
    value17               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,
    value18               NUMERIC(24, 12) DEFAULT '0'::NUMERIC       NOT NULL,
    value19               INTEGER         DEFAULT 0,
    value20               INTEGER         DEFAULT 0,
    CONSTRAINT aggregates_dates_pkey
        PRIMARY KEY (id, date, currency)
)
    PARTITION BY RANGE (date);
CREATE TABLE aggregates_dates_2020_01
    PARTITION OF aggregates_dates
        FOR VALUES FROM ('2020-01-01 00:00:00') TO ('2020-01-31 23:59:59')
    PARTITION BY LIST (currency);
CREATE TABLE aggregates_dates_2020_01_eur
    PARTITION OF aggregates_dates_2020_01
        FOR VALUES IN ('EUR');
CREATE INDEX aggregates_dates_2020_01_eur_date_idx ON aggregates_dates_2020_01_eur (date);
CREATE INDEX aggregates_dates_2020_01_eur_field01_idx ON aggregates_dates_2020_01_eur (field01);
CREATE INDEX aggregates_dates_2020_01_eur_field02_idx ON aggregates_dates_2020_01_eur (field02);
CREATE INDEX aggregates_dates_2020_01_eur_field03_idx ON aggregates_dates_2020_01_eur (field03);
CREATE INDEX aggregates_dates_2020_01_eur_field04_idx ON aggregates_dates_2020_01_eur (field04);
CREATE INDEX aggregates_dates_2020_01_eur_field06_idx ON aggregates_dates_2020_01_eur (field06);
CREATE INDEX aggregates_dates_2020_01_eur_currency_idx ON aggregates_dates_2020_01_eur (currency);
CREATE INDEX aggregates_dates_2020_01_eur_field09_idx ON aggregates_dates_2020_01_eur (field09);
CREATE INDEX aggregates_dates_2020_01_eur_field10_idx ON aggregates_dates_2020_01_eur (field10);
CREATE INDEX aggregates_dates_2020_01_eur_field11_idx ON aggregates_dates_2020_01_eur (field11);
CREATE INDEX aggregates_dates_2020_01_eur_field05_idx ON aggregates_dates_2020_01_eur (field05);
CREATE INDEX aggregates_dates_2020_01_eur_field07_idx ON aggregates_dates_2020_01_eur (field07);
CREATE INDEX aggregates_dates_2020_01_eur_field08_idx ON aggregates_dates_2020_01_eur (field08);

Example query (not all fields are used) that aggregates a whole partition. This query might have many more WHERE conditions, but this one is the worst case:

EXPLAIN (ANALYSE, BUFFERS, VERBOSE)
SELECT COALESCE(SUM(mainTable.value01), 0) AS "value01",
       COALESCE(SUM(mainTable.value02), 0) AS "value02",
       COALESCE(SUM(mainTable.value03), 0) AS "value03",
       COALESCE(SUM(mainTable.value06), 0) AS "value06",
       COALESCE(SUM(mainTable.value09), 0) AS "value09",
       COALESCE(SUM(mainTable.value12), 0) AS "value12",
       COALESCE(SUM(mainTable.value15), 0) AS "value15",
       COALESCE(SUM(mainTable.value03 + mainTable.value06 + mainTable.value09 + mainTable.value12 +
                    mainTable.value15), 0) AS "kpi01",
       COALESCE(SUM(mainTable.value05) * 1, 0) AS "value05",
       COALESCE(SUM(mainTable.value08) * 1, 0) AS "value08",
       COALESCE(SUM(mainTable.value11) * 1, 0) AS "value11",
       COALESCE(SUM(mainTable.value14) * 1, 0) AS "value14",
       COALESCE(SUM(mainTable.value17) * 1, 0) AS "value17",
       COALESCE(SUM(mainTable.value05 + mainTable.value08 + mainTable.value11 + mainTable.value14 +
                    mainTable.value17) * 1, 0) AS "kpi02",
       CASE
           WHEN SUM(mainTable.value02) > 0 THEN (1.0 * SUM(
                       mainTable.value05 + mainTable.value08 + mainTable.value11 +
                       mainTable.value14 + mainTable.value17) / SUM(mainTable.value02) * 1000 * 1)
           ELSE 0 END AS "kpiEpm",
       CASE
           WHEN SUM(mainTable.value01) > 0 THEN (1.0 * SUM(
                       mainTable.value05 + mainTable.value08 + mainTable.value11 +
                       mainTable.value14) / SUM(mainTable.value01) * 1)
           ELSE 0 END
FROM aggregates_dates mainTable
WHERE (mainTable.date BETWEEN '2020-01-01 00:00:00' AND '2020-02-01 00:00:00')
  AND (mainTable.currency = 'EUR')
GROUP BY mainTable.field02;

EXPLAIN:

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|QUERY PLAN                                                                                                                                                                          |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|HashAggregate  (cost=3748444.51..3748502.07 rows=794 width=324) (actual time=10339.771..10340.497 rows=438 loops=1)                                                                 |
|  Group Key: maintable.field02                                                                                                                                                      |
|  Batches: 1  Memory Usage: 1065kB                                                                                                                                                  |
|  Buffers: shared hit=2445343                                                                                                                                                       |
|  ->  Append  (cost=0.00..2706608.65 rows=11575954 width=47) (actual time=212.934..4549.921 rows=7829230 loops=1)                                                                   |
|        Buffers: shared hit=2445343                                                                                                                                                 |
|        ->  Seq Scan on aggregates_2020_01 maintable_1  (cost=0.00..2646928.38 rows=11570479 width=47) (actual time=212.933..4055.104 rows=7823923 loops=1)                        |
|              Filter: ((date >= '2020-01-01 00:00:00'::timestamp without time zone) AND (date <= '2020-02-01 00:00:00'::timestamp without time zone) AND (currency = 'EUR'::bpchar))|
|              Buffers: shared hit=2444445                                                                                                                                           |
|        ->  Index Scan using aggregates_2020_02_date_idx on aggregates_2020_02 maintable_2  (cost=0.56..1800.50 rows=5475 width=47) (actual time=0.036..6.476 rows=5307 loops=1)  |
|              Index Cond: ((date >= '2020-01-01 00:00:00'::timestamp without time zone) AND (date <= '2020-02-01 00:00:00'::timestamp without time zone))                           |
|              Filter: (currency = 'EUR'::bpchar)                                                                                                                                    |
|              Rows Removed by Filter: 31842                                                                                                                                         |
|              Buffers: shared hit=898                                                                                                                                               |
|Planning Time: 0.740 ms                                                                                                                                                             |
|JIT:                                                                                                                                                                                |
|  Functions: 15                                                                                                                                                                     |
|  Options: Inlining true, Optimization true, Expressions true, Deforming true                                                                                                       |
|  Timing: Generation 4.954 ms, Inlining 14.249 ms, Optimization 121.115 ms, Emission 77.181 ms, Total 217.498 ms                                                                    |
|Execution Time: 10345.662 ms                                                                                                                                                        |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Server spec:

  • AMD, 64 threads
  • 315GB RAM
  • 6xSSD RAID 10

Postgres config:
postgresql_autovacuum_vacuum_scale_factor: 0.4
postgresql_checkpoint_completion_target: 0.9
postgresql_checkpoint_timeout: 10min
postgresql_effective_cache_size: 240GB
postgresql_maintenance_work_mem: 2GB
postgresql_random_page_cost: 1.0
postgresql_shared_buffers: 80GB
postgresql_synchronous_commit: local
postgresql_work_mem: 1GB

[Updated 2021-04-27]

I've updated the server configuration:

postgresql_max_worker_processes: 64
postgresql_max_parallel_workers_per_gather: 32
postgresql_max_parallel_workers: 64
postgresql_max_parallel_maintenance_workers: 4

The full query I run on production data (which is much longer and aggregates all table fields) does not get any faster and does not use a parallel plan (is the SELECT list too big?). When I reduce the number of aggregations in the SELECT it starts using parallelism and performance improves a lot, but as soon as I revert to the original query it stops using parallel again.

  • Tens of millions of rows don't make a table "huge" ... with the proper indexes any SQL server can filter and aggregate them reasonably efficiently. Many hundreds of millions? that's huge. – O. Jones Apr 26 '21 at 11:59
  • Your set-up code throws multiple errors. Then you query a different table than the one you just created. And then the execution plan you show matches neither the set-up code, nor the query itself. If you want to anonymize things, that's fine. But you have to do it correctly, or we will have no idea what you are actually trying to do. – jjanes Apr 26 '21 at 16:36
  • @jjanes I fixed Schema creation and select. Analyze is from the real database with data. – Pyton Apr 26 '21 at 17:07

2 Answers


You need a composite BTREE index on (currency, date, field02) to help satisfy this particular query efficiently.

Your query filters by equality on currency, by range on date, and then groups by field02. So PostgreSQL can random-access this index to the first eligible row and then scan it sequentially to get the date range and do the grouping.
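
For example, a minimal sketch (the index name is invented; on PostgreSQL 11+ creating it on the partitioned parent cascades it to the existing subpartitions, or you can create it on each subpartition individually):

     CREATE INDEX aggregates_dates_currency_date_field02_idx
         ON aggregates_dates (currency, date, field02);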

Pro tip: date BETWEEN '2020-01-01 00:00:00' AND '2020-02-01 00:00:00' means

     date >= '2020-01-01 00:00:00'  AND date <= '2020-02-01 00:00:00'

I think you may want (note the < in place of <=)

     date >= '2020-01-01 00:00:00'  AND date < '2020-02-01 00:00:00'

Pro tip: Avoid creating lots of single-column indexes unless you know you need them to satisfy queries or enforce uniqueness constraints like primary keys. Instead, create the compound indexes you need to satisfy your queries. Indexes slow down INSERT and UPDATE operations, so if they don't help SELECT operations they're worse than useless. If you have different WHERE filters, you may need different indexes to satisfy your queries. PostgreSQL has a good explanation of these so-called covering indexes.
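
For instance (a sketch only; the INCLUDE columns are just guesses at what the report needs), a covering index on one subpartition could let this aggregation run as an index-only scan:

     CREATE INDEX aggregates_dates_2020_01_eur_cover_idx
         ON aggregates_dates_2020_01_eur (currency, date, field02)
         INCLUDE (value01, value02, value05);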

(Your server is nice. Once you get your indexes right, you may find that it is overprovisioned and costing you more than it needs to.)

O. Jones
  • About the BTREE index: this query is just one example of many. There can be more conditions and more groupings, so creating indexes for every case is quite a bit of work :) Pro tip 1: Yes, I know how BETWEEN behaves; the dates are just an example and the range might be wider, e.g. half a year, a full year and so on. Pro tip 2: I know how indexes work. We don't have a problem with inserts, we have a problem with SELECTs. – Pyton Apr 26 '21 at 17:09

This is a report so it will be queried very often.

This statement is confusing. Transaction processing queries are usually run very often; reports, on the other hand, are usually run quite seldom: once a year, once a month, once a day, maybe once a shift.

7,829,230 is a lot of rows to be aggregating over, particularly with so many different aggregates being computed. Parallel query should be able to speed this up, and in my hands it does. It is hard to see why it doesn't work for you in v13, unless you have changed your config in ways not shown here that prevent it from working. Does parallel query work in general?
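
One quick way to test (a sketch; these session settings just push the planner toward a parallel plan for the test, they are not recommended production values):

     -- in one session, lower the parallel cost thresholds, then check the
     -- plan output for Gather / Parallel Seq Scan nodes
     SET max_parallel_workers_per_gather = 8;
     SET parallel_setup_cost = 0;
     SET parallel_tuple_cost = 0;
     EXPLAIN (ANALYZE, BUFFERS)
     SELECT sum(value01) FROM aggregates_dates
     WHERE date >= '2020-01-01' AND date < '2020-02-01' AND currency = 'EUR';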

Your tables don't actually seem to follow the partitioning scheme you lay out, as neither of the tables in the actual plan includes a currency name as part of the table name. Also, your time boundaries don't match between the query and the partitions: the exact moment 2020-02-01 00:00:00 falls in a different partition than all the other times that satisfy your query. But this doesn't matter much, as it doesn't take long to pull in those extra rows; it is the aggregation itself that takes most of the time, and the seq scan is apparently not discarding any rows based on its filter, so there is nothing to optimize there.
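
If you do want the boundaries to line up (not that it matters much here), the usual convention is an exclusive upper bound on the partition and a half-open range in the query, for example:

     CREATE TABLE aggregates_dates_2020_01
         PARTITION OF aggregates_dates
             FOR VALUES FROM ('2020-01-01 00:00:00') TO ('2020-02-01 00:00:00')
         PARTITION BY LIST (currency);

     -- and in the query
     WHERE date >= '2020-01-01 00:00:00' AND date < '2020-02-01 00:00:00'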

If you really want to change up the performance in a big way here, what you could do is change to daily partitioning, and then precompute daily aggregates. Then your queries would have to aggregate only intraday data for the most recent day, and combine that with the precomputed aggregates for prior days. You would have to manually construct such queries, though. Unless one of the scaling/sharding extensions/forks can do that for you.
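
A minimal sketch of that idea (all names here are invented, and the rollup would need to be refreshed for each closed day by a job you schedule yourself):

     -- precomputed rollup of closed days
     CREATE TABLE aggregates_daily_rollup AS
     SELECT date_trunc('day', date) AS day,
            currency,
            field02,
            sum(value01) AS value01,
            sum(value02) AS value02
     FROM aggregates_dates
     WHERE date < date_trunc('day', now())
     GROUP BY 1, 2, 3;

     -- report: precomputed closed days plus the still-changing current day
     SELECT field02, sum(value01) AS value01, sum(value02) AS value02
     FROM (
         SELECT field02, value01, value02
         FROM aggregates_daily_rollup
         WHERE currency = 'EUR' AND day >= '2020-01-01'
         UNION ALL
         SELECT field02, value01, value02
         FROM aggregates_dates
         WHERE currency = 'EUR' AND date >= date_trunc('day', now())
     ) t
     GROUP BY field02;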

jjanes