I have a really simple query I'm running on a large table (500k rows) to page results.
Originally I was using this query, which is really fast:
select * from deck
order by
deck.sas_rating desc
limit 10
Its explain analyze shows a 0.2ms execution time. Cool.
But the sas_rating column has duplicate integer values, and when paging through the results (using offset for the later pages) I realized I was getting duplicate rows across pages. No problem: add the primary key as a secondary order by column. But the performance is terrible.
select * from deck
order by
deck.sas_rating desc,
deck.id asc
limit 10
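For reference, the later pages just add an offset to the same query; a hypothetical page-3 request (page size 10) looks like this:

```sql
-- Same ordering as above, skipping the first 20 rows to get page 3.
select * from deck
order by
  deck.sas_rating desc,
  deck.id asc
limit 10 offset 20
```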
That takes 685ms, with this explain analyze output:
Limit  (cost=164593.15..164593.17 rows=10 width=1496) (actual time=685.138..685.139 rows=10 loops=1)
  ->  Sort  (cost=164593.15..165866.51 rows=509343 width=1496) (actual time=685.137..685.137 rows=10 loops=1)
        Sort Key: sas_rating DESC, id
        Sort Method: top-N heapsort  Memory: 59kB
        ->  Seq Scan on deck  (cost=0.00..153586.43 rows=509343 width=1496) (actual time=0.009..593.444 rows=509355 loops=1)
Planning time: 0.143 ms
Execution time: 685.171 ms
It's even worse on my weaker production server. My search went from 125ms total to 35 seconds!
I tried adding a multi-column index, but that didn't improve performance. Is there any way to prevent duplicate results when using limit + offset without destroying the performance of the query?
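In case it matters, the multi-column index I tried was declared roughly like this (sketch from memory; the name is made up). My understanding is that the per-column directions have to match the order by exactly for the planner to use it:

```sql
-- Sketch of the index I tried; mixed directions to match
-- "order by sas_rating desc, id asc".
create index deck_sas_rating_id_idx
    on deck (sas_rating desc, id asc);
```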