I have a query which is generating a trend response and getting me counts of devices for various dates. The query is going over almost 500k rows. The table has almost 17.5 million records. I have partitioned the table based on id so that it can only look for a specific partition but still it is quite slow. Each partition has almost 200k records. Any idea how to improve the performance of this.
Query
select start_date,end_date,average, fail_count, warning_count,pass_count, average from
(
select generate_series(timestamp '2021-01-18 00:00:00', timestamp '2021-02-12 00:00:00', interval '1 day')::date) t(start_date)
LEFT JOIN (
SELECT start_date, end_date, avg(score) as average
, count(*) FILTER (WHERE status = 'Fail') AS fail_count
, count(*) FILTER (WHERE status = 'Warning') AS warning_count
, count(*) FILTER (WHERE status = 'Pass') AS pass_count
FROM performance.tenant_based scd join performance.hierarchy dh on dh.id = scd.id and dh.tag = scd.tag
where dh.parent_id in (0,1,2,3,4,5,6,7,8,9,10) and dh.child_id in (0,1,2,3,4,5,6,7,8,9,10) and dh.desc in ('test')
and dh.id ='ita68f0c03880e4c6694859dfa74f1cdf6' AND start_date >= '2021-01-18 00:00:00' -- same date range as above
AND start_date <= '2021-02-12 00:00:00'
GROUP BY 1,2
) s USING (start_date)
ORDER BY 1;
The Query plan is below
Sort (cost=241350.02..241850.02 rows=200000 width=104) (actual time=3453.888..3453.890 rows=26 loops=1)
Sort Key: (((((generate_series('2021-01-18 00:00:00'::timestamp without time zone, '2021-02-12 00:00:00'::timestamp without time zone, '1 day'::interval)))::date))::timestamp without time zone)
Sort Method: quicksort Memory: 28kB
-> Merge Left Join (cost=201014.95..212802.88 rows=200000 width=104) (actual time=2901.012..3453.867 rows=26 loops=1)
Merge Cond: ((((generate_series('2021-01-18 00:00:00'::timestamp without time zone, '2021-02-12 00:00:00'::timestamp without time zone, '1 day'::interval)))::date) = scd.start_date)
-> Sort (cost=79.85..82.35 rows=1000 width=4) (actual time=0.015..0.024 rows=26 loops=1)
Sort Key: (((generate_series('2021-01-18 00:00:00'::timestamp without time zone, '2021-02-12 00:00:00'::timestamp without time zone, '1 day'::interval)))::date)
Sort Method: quicksort Memory: 26kB
-> Result (cost=0.00..20.02 rows=1000 width=4) (actual time=0.003..0.009 rows=26 loops=1)
-> ProjectSet (cost=0.00..5.02 rows=1000 width=8) (actual time=0.002..0.006 rows=26 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=1)
-> Materialize (cost=200935.11..209318.03 rows=40000 width=72) (actual time=2900.992..3453.789 rows=25 loops=1)
-> Finalize GroupAggregate (cost=200935.11..208818.03 rows=40000 width=72) (actual time=2900.990..3453.771 rows=25 loops=1)
Group Key: scd.start_date, scd.end_date
-> Gather Merge (cost=200935.11..207569.38 rows=49910 width=72) (actual time=2879.365..3453.827 rows=75 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial GroupAggregate (cost=199935.08..200808.51 rows=24955 width=72) (actual time=2686.465..3228.313 rows=25 loops=3)
Group Key: scd.start_date, scd.end_date
-> Sort (cost=199935.08..199997.47 rows=24955 width=25) (actual time=2664.518..2860.477 rows=1666667 loops=3)
Sort Key: scd.start_date, scd.end_date
Sort Method: external merge Disk: 59840kB
-> Hash Join (cost=44891.11..198112.49 rows=24955 width=25) (actual time=111.653..1817.228 rows=1666667 loops=3)
Hash Cond: (scd.tag = (dh.tag)::text)
-> Append (cost=0.00..145159.33 rows=2083333 width=68) (actual time=0.006..591.818 rows=1666667 loops=3)
-> Parallel Seq Scan on ita68f0c03880e4c6694859dfa74f1cdf6 scd (cost=0.00..145159.33 rows=2083333 width=68) (actual time=0.006..455.525 rows=1666667 loops=3)
Filter: ((start_date >= '2021-01-18 00:00:00'::timestamp without time zone) AND (start_date <= '2021-02-12 00:00:00'::timestamp without time zone) AND ((id)::text = 'ita68f0c03880e4c6694859dfa74f1cdf6'::text))
-> Hash (cost=44638.71..44638.71 rows=20192 width=45) (actual time=111.502..111.502 rows=200000 loops=3)
Buckets: 65536 (originally 32768) Batches: 8 (originally 1) Memory Usage: 3585kB
-> Bitmap Heap Scan on hierarchy dh (cost=1339.01..44638.71 rows=20192 width=45) (actual time=26.542..62.078 rows=200000 loops=3)
Recheck Cond: (((id)::text = 'ita68f0c03880e4c6694859dfa74f1cdf6'::text) AND (parent_id = ANY ('{0,1,2,3,4,5,6,7,8,9,10}'::integer[])) AND (child_id = ANY ('{0,1,2,3,4,5,6,7,8,9,10}'::integer[])) AND ((desc)::text = 'test'::text))
Heap Blocks: exact=5717
-> Bitmap Index Scan on hierarchy_id_region_idx (cost=0.00..1333.96 rows=20192 width=0) (actual time=25.792..25.792 rows=200000 loops=3)
Index Cond: (((id)::text = 'ita68f0c03880e4c6694859dfa74f1cdf6'::text) AND (parent_id = ANY ('{0,1,2,3,4,5,6,7,8,9,10}'::integer[])) AND (child_id = ANY ('{0,1,2,3,4,5,6,7,8,9,10}'::integer[])) AND ((desc)::text = 'test'::text))
Planning time: 0.602 ms
Execution time: 3463.440 ms