I have a table (over 100 million records) on PostgreSQL 13.1:

CREATE TABLE report
(
    id     serial primary key,
    license_plate_id integer,
    datetime timestamp
);

Indexes (for testing I created both of them):

create index report_lp_datetime_index on report (license_plate_id, datetime);
create index report_lp_datetime_desc_index on report (license_plate_id desc, datetime desc);

So, my question is: why is a query like

select * from report r
where r.license_plate_id in (1,2,4,5,6,7,8,10,15,22,34,75)
order by datetime desc
limit 100

so slow (~10 seconds), while the same query without the ORDER BY clause is fast (milliseconds)?

EXPLAIN output:

explain (analyze, buffers, format text) select * from report r
where r.license_plate_id in (1,2,4,5,6,7,8,10,15,22,34, 75,374,57123)
limit 100
Limit  (cost=0.57..400.38 rows=100 width=316) (actual time=0.037..0.216 rows=100 loops=1)
  Buffers: shared hit=103
  ->  Index Scan using report_lp_id_idx on report r  (cost=0.57..44986.97 rows=11252 width=316) (actual time=0.035..0.202 rows=100 loops=1)
        Index Cond: (license_plate_id = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75,374,57123}'::integer[]))
        Buffers: shared hit=103
Planning Time: 0.228 ms
Execution Time: 0.251 ms


explain (analyze, buffers, format text) select * from report r
where r.license_plate_id in (1,2,4,5,6,7,8,10,15,22,34,75,374,57123)
order by datetime desc
limit 100
Limit  (cost=44193.63..44193.88 rows=100 width=316) (actual time=4921.030..4921.047 rows=100 loops=1)
  Buffers: shared hit=11455 read=671
  ->  Sort  (cost=44193.63..44221.76 rows=11252 width=316) (actual time=4921.028..4921.035 rows=100 loops=1)
        Sort Key: datetime DESC
        Sort Method: top-N heapsort  Memory: 128kB
        Buffers: shared hit=11455 read=671
        ->  Bitmap Heap Scan on report r  (cost=151.18..43763.59 rows=11252 width=316) (actual time=54.422..4911.927 rows=12148 loops=1)
              Recheck Cond: (license_plate_id = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75,374,57123}'::integer[]))
              Heap Blocks: exact=12063
              Buffers: shared hit=11455 read=671
              ->  Bitmap Index Scan on report_lp_id_idx  (cost=0.00..148.37 rows=11252 width=0) (actual time=52.631..52.632 rows=12148 loops=1)
                    Index Cond: (license_plate_id = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75,374,57123}'::integer[]))
                    Buffers: shared hit=59 read=4
Planning Time: 0.427 ms
Execution Time: 4921.128 ms
Alexey

2 Answers

You seem to have rather slow storage, if reading 671 8kB blocks from disk takes a couple of seconds.

The way to speed this up is to reorder the table in the same way as the index, so that you can find the required rows in the same or adjacent table blocks:

CLUSTER report USING report_lp_id_idx;

Be warned that rewriting the table in this way causes downtime: the table will not be available while it is being rewritten. Moreover, PostgreSQL does not maintain the table order, so subsequent data modifications will cause performance to gradually deteriorate, and after a while you will have to run CLUSTER again.

But if you need this query to be fast no matter what, CLUSTER is the way to go.
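
A minimal sketch of the maintenance cycle this implies (the pg_stats correlation check at the end is an optional extra, not something the answer prescribes):

-- one-time rewrite; takes an ACCESS EXCLUSIVE lock on report for the duration
CLUSTER report USING report_lp_id_idx;
ANALYZE report;

-- later runs can omit the index name: PostgreSQL remembers which index
-- the table was last clustered on
CLUSTER report;

-- optional: a correlation close to 1.0 (or -1.0) means the physical row
-- order still matches the column order, so re-clustering can wait
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'report' AND attname = 'license_plate_id';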

Laurenz Albe

Your two indexes do exactly the same thing, so you can remove the second one; it's useless.
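
A B-tree index can be scanned in both directions, so the DESC variant buys you nothing; a sketch of the cleanup:

-- the DESC index is redundant: PostgreSQL can scan the ascending index
-- backward to produce descending order
DROP INDEX report_lp_datetime_desc_index;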

To optimize your query, the order of the fields inside the index must be reversed:

create index report_lp_datetime_index on report (datetime,license_plate_id);


BEGIN;
CREATE TABLE foo (d INTEGER, i INTEGER);
INSERT INTO foo SELECT random()*100000, random()*1000 FROM generate_series(1,1000000) s;
CREATE INDEX foo_d_i ON foo(d DESC,i);
COMMIT;
VACUUM ANALYZE foo;
EXPLAIN ANALYZE SELECT * FROM foo WHERE i IN (1,2,4,5,6,7,8,10,15,22,34,75) ORDER BY d DESC LIMIT 100;

 Limit  (cost=0.42..343.92 rows=100 width=8) (actual time=0.076..9.359 rows=100 loops=1)
   ->  Index Only Scan Backward using foo_d_i on foo  (cost=0.42..40976.43 rows=11929 width=8) (actual time=0.075..9.339 rows=100 loops=1)
         Filter: (i = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75}'::integer[]))
         Rows Removed by Filter: 9016
         Heap Fetches: 0
 Planning Time: 0.339 ms
 Execution Time: 9.387 ms

Note the index is not used to optimize the WHERE clause. It is used here as a compact and fast way to store references to the rows ordered by date DESC, so the ORDER BY can be satisfied by an index scan and avoid sorting. By adding column i to the index, an index-only scan can test the condition on i without hitting the table for every row. Since there is a low LIMIT value, it does not need to scan the whole index; it only scans it in date DESC order until it finds enough rows satisfying the WHERE condition to return the result.

It will be faster if you create the index in date DESC order; this could be useful if you use ORDER BY date DESC + LIMIT in other queries too.
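
Translated back to OP's table, that would be something like the following (a sketch; the index name is made up):

-- sketch: the same idea on the original table, with datetime leading
-- and stored in DESC order
CREATE INDEX report_datetime_desc_lp_index ON report (datetime DESC, license_plate_id);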

A comment below objects that OP's table has a third column, and he is using SELECT *, so that wouldn't be an index-only scan.

True, and easy to work around. The optimum way to do this query would be an index-only scan to filter on the WHERE conditions, then LIMIT, then hit the table to get the rows. For some reason, if SELECT * is used, Postgres takes column i from the table instead of taking it from the index, which results in lots of unnecessary heap fetches for rows whose i is rejected by the WHERE condition.

So let's do it manually. I've also added another bogus column to make sure the SELECT * hits the table; the rebuilt test table is sketched below.
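
The rebuild step is not shown in the original answer; a sketch of what the revised test table presumably looks like (the id and pad columns are inferred from the plans below):

BEGIN;
DROP TABLE IF EXISTS foo;
-- id is needed for the foo_pkey lookups further down; pad is the bogus
-- column that forces SELECT * to visit the heap
CREATE TABLE foo (id SERIAL PRIMARY KEY, d INTEGER, i INTEGER, pad TEXT);
INSERT INTO foo (d, i, pad)
SELECT random()*100000, random()*1000, 'x' FROM generate_series(1,1000000) s;
CREATE INDEX foo_d_i ON foo (d DESC, i);
COMMIT;
VACUUM ANALYZE foo;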

EXPLAIN (ANALYZE,buffers) SELECT * FROM foo 
JOIN (SELECT d,i FROM foo WHERE i IN (1,2,4,5,6,7,8,10,15,22,34,75) ORDER BY d DESC LIMIT 100) f USING (d,i) 
ORDER BY d DESC LIMIT 100;

 Limit  (cost=0.85..1281.94 rows=1 width=17) (actual time=0.052..3.618 rows=100 loops=1)
   Buffers: shared hit=453
   ->  Nested Loop  (cost=0.85..1281.94 rows=1 width=17) (actual time=0.050..3.594 rows=100 loops=1)
         Buffers: shared hit=453
         ->  Limit  (cost=0.42..435.44 rows=100 width=8) (actual time=0.037..2.953 rows=100 loops=1)
               Buffers: shared hit=53
               ->  Index Only Scan using foo_d_i on foo foo_1  (cost=0.42..51936.43 rows=11939 width=8) (actual time=0.037..2.935 rows=100 loops=1)
                     Filter: (i = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75}'::integer[]))
                     Rows Removed by Filter: 9010
                     Heap Fetches: 0
                     Buffers: shared hit=53
         ->  Index Scan using foo_d_i on foo  (cost=0.42..8.45 rows=1 width=17) (actual time=0.005..0.005 rows=1 loops=100)
               Index Cond: ((d = foo_1.d) AND (i = foo_1.i))
               Buffers: shared hit=400
 Execution Time: 3.663 ms

Another option is to just add the primary key to the (date, license_plate_id) index, as sketched below.
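
The answer does not show this index being created; a sketch, with the name taken from the plan below:

-- the primary key appended to the (d DESC, i) index, so the subquery can
-- return id from an index-only scan
CREATE INDEX foo_d_i_id ON foo (d DESC, i, id);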

SELECT * FROM foo JOIN (SELECT id FROM foo WHERE i IN (1,2,4,5,6,7,8,10,15,22,34,75) ORDER BY d DESC LIMIT 100) f USING (id) ORDER BY d DESC LIMIT 100;

 Limit  (cost=1357.98..1358.23 rows=100 width=17) (actual time=3.920..3.947 rows=100 loops=1)
   Buffers: shared hit=473
   ->  Sort  (cost=1357.98..1358.23 rows=100 width=17) (actual time=3.919..3.931 rows=100 loops=1)
         Sort Key: foo.d DESC
         Sort Method: quicksort  Memory: 32kB
         Buffers: shared hit=473
         ->  Nested Loop  (cost=0.85..1354.66 rows=100 width=17) (actual time=0.055..3.858 rows=100 loops=1)
               Buffers: shared hit=473
               ->  Limit  (cost=0.42..509.41 rows=100 width=8) (actual time=0.039..3.116 rows=100 loops=1)
                     Buffers: shared hit=73
                     ->  Index Only Scan using foo_d_i_id on foo foo_1  (cost=0.42..60768.43 rows=11939 width=8) (actual time=0.039..3.093 rows=100 loops=1)
                           Filter: (i = ANY ('{1,2,4,5,6,7,8,10,15,22,34,75}'::integer[]))
                           Rows Removed by Filter: 9010
                           Heap Fetches: 0
                           Buffers: shared hit=73
               ->  Index Scan using foo_pkey on foo  (cost=0.42..8.44 rows=1 width=17) (actual time=0.006..0.006 rows=1 loops=100)
                     Index Cond: (id = foo_1.id)
                     Buffers: shared hit=400
Execution Time: 3.972 ms

Edit

After thinking about it... since the LIMIT restricts the output to 100 rows ordered by date DESC, wouldn't it be nice if we could get the 100 most recent rows for each license_plate_id, put all of that into a top-N sort, and only keep the best 100 across all license_plate_ids? That would avoid reading and then throwing away a lot of rows from the index: even though scanning the index is much faster than hitting the table, it still loads those index pages into RAM and clogs up your buffers with stuff you don't actually need to keep in cache. Let's use a LATERAL join:

EXPLAIN (ANALYZE,BUFFERS) 
SELECT * FROM foo 
  JOIN (SELECT d,i FROM 
    (VALUES (1),(2),(4),(5),(6),(7),(8),(10),(15),(22),(34),(75)) idlist 
    CROSS JOIN LATERAL 
    (SELECT d,i FROM foo WHERE i=idlist.column1 ORDER BY d DESC LIMIT 100) f2 
    ORDER BY d DESC LIMIT 100
  ) f3 USING (d,i)
  ORDER BY d DESC LIMIT 100;

It's even faster: 2 ms, and it uses the index on (license_plate_id, date) instead of the other way around. Also, and this is important, each subquery in the lateral hits only the index pages that contain rows that will actually be selected, while the previous queries hit many more index pages. So you save on RAM buffers.
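
That index is not shown being created on foo; presumably it mirrors OP's original report_lp_datetime_index, something like:

-- sketch: the per-plate index the LATERAL subqueries can use
-- (equivalent to report's (license_plate_id, datetime))
CREATE INDEX foo_i_d ON foo (i, d DESC);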

If you don't need the index on (date, license_plate_id) and don't want to keep a useless index around, this could be interesting, since this query doesn't use it. On the other hand, if you need that index for something else and want to keep it anyway, then... maybe not.
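
For reference, a sketch of the same LATERAL pattern translated back to OP's report table (untested against his data; it relies on the existing report_lp_datetime_index on (license_plate_id, datetime)):

SELECT r.*
FROM report r
JOIN (
    -- top-N sort over at most 100 rows per plate
    SELECT datetime, license_plate_id
    FROM (VALUES (1),(2),(4),(5),(6),(7),(8),(10),(15),(22),(34),(75)) ids(lp)
    CROSS JOIN LATERAL (
        -- the 100 most recent rows for one license_plate_id
        SELECT datetime, license_plate_id
        FROM report
        WHERE license_plate_id = ids.lp
        ORDER BY datetime DESC
        LIMIT 100
    ) per_plate
    ORDER BY datetime DESC
    LIMIT 100
) top100 USING (datetime, license_plate_id)
ORDER BY datetime DESC
LIMIT 100;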

Please post results for the winning query.

bobflux
  • I take back the "nonsense", but your index doesn't support the `WHERE` condition (see the `Filter`). If you remove the useless `license_plate_id` from the index definition, I'll upvote. – Laurenz Albe Feb 01 '21 at 08:43
  • No. Removing license_plate_id from the index would prevent postgres from doing an index-only scan, which is fast because it is mostly sequential IO. It would have to hit the table to fetch license_plate_id for each row, which would result in lots of random IO, which is OP's problem. It's pretty much the same idea as using CLUSTER, except the index is much smaller than the table and always ordered, so it's better, but limited to columns actually stored in the index, of course. – bobflux Feb 01 '21 at 08:49
  • You forget that OP's table has a third column, and he is using `SELECT *`. So that wouldn't be an index-only scan. – Laurenz Albe Feb 01 '21 at 09:12
  • see edit: what matters is to only hit the table for rows that will satisfy the WHERE condition, and avoid table IO for rows that don't. I expected postgres to do the right thing, i.e. use the i column from the index in the WHERE, but it did not. Easy to work around. – bobflux Feb 01 '21 at 11:27
  • I've just found a much better solution using LATERAL JOIN – bobflux Feb 01 '21 at 12:07
  • @bobflux I still haven't settled on a solution for this question; I've worked on it for a week. With all these variants, in my case the result is similar: a very long query with the ORDER BY statement. Actually I have a more complicated schema, but even with all simplifications I can't achieve good performance. My current Mongo setup with the same data performs this query in milliseconds; Postgres, which I am learning now, does not :( Just "SELECT * ... WHERE s_id IN (...) ORDER BY ...". Without ORDER BY, all is good. Mongo handles everything well in any case. Should I ask another question with more details and comparisons with Mongo? (sorry, English is not my native language) – Alexey Feb 05 '21 at 13:27
  • @bobflux I did it, I created a new question; I hope it is ok: https://stackoverflow.com/questions/66070195/postgresql-and-mongodb-where-in-condtions – Alexey Feb 05 '21 at 20:51
  • As complementary information, PostgreSQL is the slowest RDBMS compared to all the others for COUNT queries. In the performance tests I have made between MS SQL Server and PostgreSQL (http://mssqlserver.fr/postgresql-vs-microsoft-sql-server-comparison-part-2-count-performances/), it is clear that PG runs COUNT queries between 60 and 1500 times slower than SQL Server. – SQLpro Dec 01 '21 at 12:34
  • @SQLpro the conclusion is wrong; in this case postgres is slow not because of MVCC but because it fails to optimize DISTINCT in a smart way. I think it uses a large sort, which will be slow. For query 2, you can try "select count(*) from (select prs_prenom from t_personne_prs group by prs_prenom) foo;" and it will use a hash for the GROUP BY, which is about 10x faster on your dataset. Most likely MSSQL uses the index to get the counts, which would be a lot faster because the data is already sorted, so the DISTINCT becomes very simple. Maybe it has more tricks too. – bobflux Dec 01 '21 at 12:58
  • Not only because of MVCC... – SQLpro Dec 01 '21 at 13:04
  • @SQLpro quite mysterious how MSSQL can do QRY2 that fast without an index; it's as fast as clickhouse... Is it using 48 cores? – bobflux Dec 01 '21 at 13:32
  • For information, from the PG staff... https://wiki.postgresql.org/wiki/Slow_Counting speaks about MVCC slowing COUNT... – SQLpro Dec 01 '21 at 13:58
  • @bobflux Of course it uses all the cores, but it does not really count row by row; it only reads metadata in the page header that contains the real number of data slots inside a page... PG, due to MVCC, needs to read every page and every row to avoid phantom slots. This is a great misconception of the SQL engine, which stores row versions including obsolete ones. In other RDBMSs, optimistic locking places the copied rows into another storage space to avoid this behaviour (transaction log: Oracle, tempdb: SQL Server...). – SQLpro Dec 01 '21 at 14:04
  • @SQLpro Yes, if it were counting the whole table, MSSQL would look only at the metadata; postgres would have to check all the rows. But I'm talking about QRY2, which is count(distinct column): without an index, it has to read every row to be able to find the distinct values. Unless it stores distinct values in the header, but that would be surprising. So it reads 10M rows and computes the DISTINCT in 79 ms, which is really fast. If it uses a hash, that's 8 nanoseconds per hash. If it uses 10 cores, that's still 80 ns/hash, plus overhead... That's very impressive. It JIT compiles, right? – bobflux Dec 01 '21 at 14:33
  • Of course it is really difficult to store a DISTINCT count without having a "rupture" in the header associated with the counted value, to know whether the next page has the same value or a new one... MS Research does not speak about that point, but if I had to design such an RDBMS I would do it in the index pages... (There are many things MS Research does not speak about in terms of optimization, like how they get a really good estimation of cardinality when you have a LIKE '%abc%'! Some statistics are hidden!!!) – SQLpro Dec 09 '21 at 13:01