I've a query that for each row from campaigns
gets the most recent row from effects
:
SELECT
id,
(
SELECT
created
FROM
effects
WHERE
effects.campaignid = campaigns.id
ORDER BY
effects.created DESC
LIMIT 1
) AS last_activity
FROM
campaigns
WHERE
deleted_at IS NULL
AND id in(53, 54);
To optimize the performance for this query I'm using an index effects_campaign_created_idx
over (campaignid,created)
.
Additionally, for another use case, there's the index effects_created_idx
on (created)
.
For some reason, at one moment, the subquery stopped using the correct index effects_campaign_created_idx
and started using effects_created_idx
instead, which is highly ineffective and takes ~5 minutes to run, instead of ~40ms.
Whenever I execute the internal query alone (using the same campaignid
) the correct index is used.
What could be the reason for such behavior on part of the query planner? Should I structure my query differently, so that the right index is chosen?
What are more advanced ways to debug the query planner behavior?
Here's explain analyze
results from executing the offending query:
explain analyze SELECT
id,
(
SELECT
created
FROM
effects
WHERE
effects.campaignid = campaigns.id
ORDER BY
effects.created DESC
LIMIT 1) AS last_activity
FROM
campaigns
WHERE
deleted_at IS NULL
AND id in(53, 54);
-----------------------------------------
Seq Scan on campaigns (cost=0.00..4.56 rows=2 width=12) (actual time=330176.476..677186.438 rows=2 loops=1)
" Filter: ((deleted_at IS NULL) AND (id = ANY ('{53,54}'::integer[])))"
Rows Removed by Filter: 45
SubPlan 1
-> Limit (cost=0.43..0.98 rows=1 width=8) (actual time=338593.165..338593.166 rows=1 loops=2)
-> Index Scan Backward using effects_created_idx on effects (cost=0.43..858859.67 rows=1562954 width=8) (actual time=338593.160..338593.160 rows=1 loops=2)
Filter: (campaignid = campaigns.id)
Rows Removed by Filter: 14026092
Planning Time: 0.245 ms
Execution Time: 677195.239 ms
Following advice here, tried moving to MAX(created)
instead of using the Subquery with ORDER BY created DESC LIMIT 1
. Unfortunately the results are still poor:
EXPLAIN ANALYZE SELECT
campaigns.id,
subquery.created
FROM
campaigns
LEFT JOIN (
SELECT
campaignid,
MAX(created) created
FROM
effects
GROUP BY
campaignid) subquery ON campaigns.id = subquery.campaignid
WHERE
campaigns.deleted_at IS NULL
AND campaigns.id in(53, 54);
Hash Right Join (cost=667460.06..667462.46 rows=2 width=12) (actual time=30516.620..30573.091 rows=2 loops=1)
Hash Cond: (effects.campaignid = campaigns.id)
-> Finalize GroupAggregate (cost=667457.45..667459.73 rows=9 width=16) (actual time=30251.920..30308.379 rows=23 loops=1)
Group Key: effects.campaignid
-> Gather Merge (cost=667457.45..667459.55 rows=18 width=16) (actual time=30251.832..30308.271 rows=49 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=666457.43..666457.45 rows=9 width=16) (actual time=30156.539..30156.544 rows=16 loops=3)
Sort Key: effects.campaignid
Sort Method: quicksort Memory: 25kB
Worker 0: Sort Method: quicksort Memory: 25kB
Worker 1: Sort Method: quicksort Memory: 25kB
-> Partial HashAggregate (cost=666457.19..666457.28 rows=9 width=16) (actual time=30155.951..30155.957 rows=16 loops=3)
Group Key: effects.campaignid
Batches: 1 Memory Usage: 24kB
Worker 0: Batches: 1 Memory Usage: 24kB
Worker 1: Batches: 1 Memory Usage: 24kB
-> Parallel Seq Scan on effects (cost=0.00..637166.13 rows=5858213 width=16) (actual time=220.784..28693.182 rows=4684157 loops=3)
-> Hash (cost=2.59..2.59 rows=2 width=4) (actual time=264.653..264.656 rows=2 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on campaigns (cost=0.00..2.59 rows=2 width=4) (actual time=264.612..264.640 rows=2 loops=1)
" Filter: ((deleted_at IS NULL) AND (id = ANY ('{53,54}'::integer[])))"
Rows Removed by Filter: 45
Planning Time: 0.354 ms
JIT:
Functions: 34
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 9.958 ms, Inlining 409.293 ms, Optimization 308.279 ms, Emission 206.936 ms, Total 934.465 ms
Execution Time: 30578.920 ms
Notes
- It is not the case where there is a majority of
effects
row withcampaignid
in(53,54)
. - I've reindexed and analyzed tables already.
- [edit: the index was created with
USING btree
]