I'm trying to understand this situation: in my test environment (PostgreSQL 14 on Debian) I have a table with many rows (roughly 18,000,000).
Suppose the table is "entities", and in simple terms it has this structure:
CREATE TABLE entities (
    id BIGINT NOT NULL,
    somefield INTEGER NOT NULL,
    otherfield TIMESTAMP NOT NULL
);
CREATE INDEX ix1 ON entities (somefield);
CREATE INDEX ix2 ON entities (otherfield);
Then I cloned it twice, into "entities1" and "entities2", with a simple script:
insert into entities1 select * from entities;
insert into entities2 select * from entities;
Then I ran
vacuum analyze entities2;
reindex table entities2;
and I kept entities1 untouched.
Now I have this simple query:
select extract(year from otherfield), COUNT(*) as n
from entities1
group by extract(year from otherfield);
which results in:
2019 43956
2020 5981223
2021 12172333
The query runs about twice as fast on the untouched table as on the analyzed one.
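For reference, both plans below were captured with plain EXPLAIN (ANALYZE), along these lines (same statement against each table):

```sql
EXPLAIN ANALYZE
SELECT extract(year FROM otherfield), count(*) AS n
FROM entities1
GROUP BY extract(year FROM otherfield);
```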
The execution plans are these:
entities1
"Finalize GroupAggregate (cost=528695.36..528746.53 rows=200 width=40) (actual time=4774.324..4780.668 rows=3 loops=1)"
" Group Key: (EXTRACT(year FROM ""otherfield""))"
" -> Gather Merge (cost=528695.36..528742.03 rows=400 width=40) (actual time=4774.316..4780.658 rows=9 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" -> Sort (cost=527695.34..527695.84 rows=200 width=40) (actual time=4727.063..4727.064 rows=3 loops=3)"
" Sort Key: (EXTRACT(year FROM ""otherfield""))"
" Sort Method: quicksort Memory: 25kB"
" Worker 0: Sort Method: quicksort Memory: 25kB"
" Worker 1: Sort Method: quicksort Memory: 25kB"
" -> Partial HashAggregate (cost=527685.19..527687.69 rows=200 width=40) (actual time=4727.040..4727.041 rows=3 loops=3)"
" Group Key: EXTRACT(year FROM ""otherfield"")"
" Batches: 1 Memory Usage: 40kB"
" Worker 0: Batches: 1 Memory Usage: 40kB"
" Worker 1: Batches: 1 Memory Usage: 40kB"
" -> Parallel Seq Scan on entities1 (cost=0.00..489773.71 rows=7582297 width=32) (actual time=0.548..3302.243 rows=6065837 loops=3)"
"Planning Time: 1.309 ms"
"Execution Time: 4780.718 ms"
entities2
"Gather (cost=1000.56..4087193.67 rows=16836383 width=40) (actual time=209.129..12448.711 rows=3 loops=1)"
" Workers Planned: 1"
" Workers Launched: 1"
" Single Copy: true"
" -> GroupAggregate (cost=0.56..2402555.37 rows=16836383 width=40) (actual time=43.676..12263.595 rows=3 loops=1)"
" Group Key: EXTRACT(year FROM ""otherfield"")"
" -> Index Scan using ix2_entities2 on entities2 (cost=0.56..2101113.02 rows=18197512 width=32) (actual time=2.756..9803.865 rows=18197512 loops=1)"
"Planning Time: 0.164 ms"
"Execution Time: 12448.750 ms"
What I immediately noticed is that the first plan runs in parallel, so I forced parallel execution on entities2 (parallel_setup_cost = 100000, parallel_tuple_cost = 0.001) and obtained this:
"GroupAggregate (cost=1532983.03..2062194.15 rows=16836383 width=40) (actual time=3373.406..10682.438 rows=3 loops=1)"
" Group Key: (EXTRACT(year FROM ""otherfield""))"
" -> Gather Merge (cost=1532983.03..1760751.81 rows=18197512 width=32) (actual time=3352.633..7930.565 rows=18197512 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" -> Sort (cost=1432983.00..1451938.74 rows=7582297 width=32) (actual time=3276.100..3999.320 rows=6065837 loops=3)"
" Sort Key: (EXTRACT(year FROM ""otherfield""))"
" Sort Method: external merge Disk: 91728kB"
" Worker 0: Sort Method: external merge Disk: 88432kB"
" Worker 1: Sort Method: external merge Disk: 86704kB"
" -> Parallel Index Only Scan using ix2_entities2 on entities2 (cost=0.44..385130.71 rows=7582297 width=32) (actual time=1.225..1763.335 rows=6065837 loops=3)"
" Heap Fetches: 0"
"Planning Time: 0.124 ms"
"Execution Time: 10693.600 ms"
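For completeness, the plan above was obtained after changing only these two session settings (the values mentioned earlier):

```sql
SET parallel_setup_cost = 100000;
SET parallel_tuple_cost = 0.001;
```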
Now, with both plans running in parallel, I see these differences:
- The seq scan on entities1 is faster than the index scan on entities2, but the seq scan has to hash-aggregate, resulting in a more expensive subplan. OK, without statistics collected, maybe the seq scan is more efficient than a scan over a non-updated index.
- The HashAggregate on entities1 estimates rows=200, which is the planner's default value. Again, maybe that's legitimate because no statistics have been collected.
- The sort methods: when forced to run in parallel, entities2 has to perform an external merge sort on disk, caused by the size of the data (~175 MB across the two workers). entities1 has practically no sort load (~50 kB across the workers), and the plan chose a quicksort.
Both queries return the same result. I don't understand why the sort is needed at all, and even less why ANALYZE has such a bad impact on performance.
What am I missing?
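If it helps the diagnosis, this is the kind of check I could run to compare what the planner has recorded for the two clones (entities1 was never analyzed, so I'd expect its estimates to come only from relation size):

```sql
-- compare stored row/page estimates and analyze timestamps for the clones
SELECT c.relname, c.reltuples, c.relpages, s.last_analyze, s.last_autoanalyze
FROM pg_class c
JOIN pg_stat_user_tables s ON s.relid = c.oid
WHERE c.relname IN ('entities1', 'entities2');
```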
Let me know if I omitted any details.
Thanks in advance.