Multi-column indices ordering by date and created_at exhibit strange behavior for different queries

Question

On postgres 10, I have a query like so, for a table with millions of rows, to grab the latest posts belonging to classrooms:

SELECT "posts".*
FROM "posts"
WHERE "posts"."school_id" = 1
  AND "posts"."classroom_id" IN (10, 11, 12, 13, 14, 15, 16, 17, 18, 19)
ORDER BY date desc, created_at desc
LIMIT 30 OFFSET 30;

Assume that classrooms only belong to one school.

I have an index like so:

t.index ["date", "created_at", "school_id", "classroom_id"], name: "optimize_post_pagination"

When I run the query, it does an index scan backwards like I'd hope and return in 0.7ms.

Limit  (cost=127336.95..254673.34 rows=30 width=494) (actual time=0.189..0.242 rows=30 loops=1)
  ->  Index Scan Backward using optimize_post_pagination on posts  (cost=0.56..1018691.68 rows=240 width=494) (actual time=0.103..0.236 rows=60 loops=1)
        Index Cond: (school_id = 1)
"        Filter: (classroom_id = ANY ('{10,11,...}'::integer[]))"
Planning time: 0.112 ms
Execution time: 0.260 ms

However, when I change the query to only include a couple classrooms:

SELECT "posts".*
FROM "posts"
WHERE "posts"."school_id" = 1
  AND "posts"."classroom_id" IN (10, 11)
ORDER BY date desc, created_at desc
LIMIT 30 OFFSET 30;

It freaks out and does a lot of extra work, taking nearly 4 sec:

  ->  Sort  (cost=933989.58..933989.68 rows=40 width=494) (actual time=3857.216..3857.219 rows=60 loops=1)
"        Sort Key: date DESC, created_at DESC"
        Sort Method: top-N heapsort  Memory: 61kB
        ->  Bitmap Heap Scan on posts  (cost=615054.27..933988.51 rows=40 width=494) (actual time=2700.871..3851.518 rows=18826 loops=1)
              Recheck Cond: (school_id = 1)
"              Filter: (classroom_id = ANY ('{10,11}'::integer[]))"
              Rows Removed by Filter: 86099
              Heap Blocks: exact=29256
              ->  Bitmap Index Scan on optimize_post_pagination  (cost=0.00..615054.26 rows=105020 width=0) (actual time=2696.385..2696.385 rows=104925 loops=1)
                    Index Cond: (school_id = 485)

What's even stranger is that if I drop the WHERE clause for school_id, both cases for classrooms (with a few or with many) runs fast with the backwards index scan.

This index cookbook suggests putting the ORDER BY index columns last, like so:

t.index ["school_id", "classroom_id", "date", "created_at"], name: "activity_page_index"

But that makes my queries slower, even though the cost is much lower.

Limit  (cost=993.93..994.00 rows=30 width=494) (actual time=208.443..208.452 rows=30 loops=1)
  ->  Sort  (cost=993.85..994.45 rows=240 width=494) (actual time=208.436..208.443 rows=60 loops=1)
"        Sort Key: date DESC, created_at DESC"
        Sort Method: top-N heapsort  Memory: 118kB
        ->  Index Scan using activity_page_index on posts  (cost=0.56..985.56 rows=240 width=494) (actual time=0.032..178.147 rows=102403 loops=1)
"              Index Cond: ((school_id = 1) AND (classroom_id = ANY ('{10,11,...}'::integer[])))"
Planning time: 0.132 ms
Execution time: 208.482 ms

Interestingly, with the activity_page_index query, it does not change its behavior when querying with fewer classrooms.

So, a few questions:

With the original query, why would the number of classrooms make such a massive difference?
Why does dropping the school_id WHERE clause make both cases run fast?
Why does dropping the school_id WHERE clause make both cases run fast, even though the index still includes school_id?
How can a high cost query finish quickly (65883 -> 0.7ms) and a lower cost query finish slower (994 -> 208ms)?

Other notes

It is necessary to order by both date and created_at, even though they may seem redundant.

score 2 · Answer 1 · answered Sep 28 '20 at 23:31

Your first plan as shown seems impossible for your query as shown. The school_id = 1 criterion should show up either as an index condition, or as a filter condition, but you don't show it in either one.

With the original query, why would the number of classrooms make such a massive difference?

With the original plan, it is getting the rows in the desired order by walking the index. Then it gets to stop early once it accumulates 60 rows which meet the non-index criteria. So the more selective those other criteria are, the most of the index it needs to walk before it gets enough rows to stop early. Removing classrooms from the list makes it more selective, so makes that plan look less attractive. At some point, it crosses a line where it looks less attractive than something else.

Why does dropping the school_id WHERE clause make both cases run fast?

You said that every classroom belongs to only one school. But PostgreSQL does not know that, it thinks the two criteria are independent, so gets the overall estimated selectivity by multiplying the two separate estimates. This gives it a very misleading estimate of the overall selectivity, which makes the already-ordered index scan look worse than it really is. Not specifying the redundant school_id prevents it from making this bad assumption about the independence of the criteria. You could create multi-column statistics to try to overcome this, but in my hands this doesn't actually help you on this query until v13 (for reasons I don't understand).

This is about the estimation process, not the execution. So school_id being in the index or not doesn't matter.

How can a high cost query finish quickly (65883 -> 0.7ms) and a lower cost query finish slower (994 -> 208ms)?

"It is difficult to make predictions, especially about the future." Cost estimates are predictions. Sometimes they don't work out very well.

Wow, I can't thank you enough for taking the time to provide this explanation. I feel like I grokked a little more about databases and indexes for the first time. Can you expand on the statement "This is about the estimation process... so school_id being in the index or not doesn't matter"? You're saying that if I drop the `school_id` from the query but leave it in the index, the estimation process isn't impacted. At that point it's just wasted index space, is that right? But if I had the inverse, where it's in the query but not in the index, that would impact performance? — Andrew Smith, Oct 01 '20 at 18:52

Multi-column indices ordering by date and created_at exhibit strange behavior for different queries

1 Answers1