I have a biggish table in PostgreSQL 15.1 (about 50 million rows and growing). A column mmsi has about 30k distinct values, so 1000+ rows per mmsi on average.
My problem is that I have a query that I need to execute repeatedly during DB load, and for certain values of mmsi it takes hundreds of seconds instead of milliseconds. The model query is simply

SELECT max(to_timestamp) FROM track WHERE mmsi = :mmsi
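For reference, the two indexes involved are plain btree indexes on the two columns. Roughly (the names match the plans below; the exact DDL is from memory and may differ slightly):

-- single-column btree indexes on track, as I recall them
CREATE INDEX ix_track_mmsi ON track (mmsi);
CREATE INDEX ix_track_to_timestamp ON track (to_timestamp);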
The anomaly is visible in the EXPLAIN output. The bad case (which only happens for a small fraction of mmsi values):
trackdb=# EXPLAIN SELECT max(to_timestamp) FROM track WHERE mmsi = 354710000;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Result  (cost=413.16..413.17 rows=1 width=8)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.56..413.16 rows=1 width=8)
           ->  Index Scan Backward using ix_track_to_timestamp on track  (cost=0.56..3894939.14 rows=9440 width=8)
                 Index Cond: (to_timestamp IS NOT NULL)
                 Filter: (mmsi = 354710000)
(6 rows)
Good case (the vast majority):
trackdb=# EXPLAIN SELECT max(to_timestamp) FROM track WHERE mmsi = 354710001;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Aggregate  (cost=1637.99..1638.00 rows=1 width=8)
   ->  Index Scan using ix_track_mmsi on track  (cost=0.44..1635.28 rows=1082 width=8)
         Index Cond: (mmsi = 354710001)
(3 rows)
Now, I notice that the estimated number of rows is larger in the bad case, but I cannot see anything in the PostgreSQL statistics (pg_stats.histogram_bounds) that would explain it.
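For what it's worth, this is roughly the query I have been using to look at the statistics; nothing about the bad mmsi values stands out in its output:

-- inspect the collected statistics for the mmsi column
SELECT n_distinct, most_common_vals, most_common_freqs, histogram_bounds
FROM pg_stats
WHERE tablename = 'track'
  AND attname = 'mmsi';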
The problem also shifts when I ANALYZE the table, in that the specific values that trigger it become different. But since the query is needed during DB load, running ANALYZE is not a solution anyway.
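(To be concrete about why ANALYZE is not an option: I could keep re-running something like the statement below while the load is in progress, but in my experience that just changes which mmsi values hit the slow plan rather than making it go away.)

-- re-collect statistics for just the mmsi column
ANALYZE track (mmsi);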
I'm stumped. Does anyone have an idea what could be happening?
[Edit: To clarify, I know ways to work around it, for example by materializing the matching rows before applying max, along the lines of the sketch below. But not understanding what is going on makes me very unhappy.]
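A sketch of one such workaround (from memory); the OFFSET 0 is there purely as an optimization fence, so the planner cannot flatten the subquery and fall back to the backward scan over ix_track_to_timestamp:

-- materialize the rows for the given mmsi first, then take the max
SELECT max(to_timestamp)
FROM (
    SELECT to_timestamp
    FROM track
    WHERE mmsi = :mmsi
    OFFSET 0   -- prevents subquery pull-up, so the scan uses ix_track_mmsi
) AS t;

A WITH ... AS MATERIALIZED wrapper should behave the same way.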