3

I have table reviews_article (1508 rows). I don't understand, why two queries with only different offset have thus different query plans. Can you explain me, please?

CREATE TABLE "reviews_article" (
    "id" serial NOT NULL PRIMARY KEY,
    "version_id" integer REFERENCES "reversion_version" ("id") DEFERRABLE INITIALLY DEFERRED,
    "published" boolean NOT NULL,
    "created" timestamp with time zone NOT NULL,
);

CREATE INDEX reviews_article_version_id_7cf1d83d68e3a3c6 
    ON reviews_article ( version_id, published );

CREATE INDEX reviews_article_created_desc_index 
    ON reviews_article ( created DESC NULLS FIRST );
EXPLAIN ANALYZE SELECT ••• FROM "reviews_article" 
LEFT OUTER JOIN "reversion_version" ON ("reviews_article"."version_id" = "reversion_version"."id") 
WHERE ("reversion_version"."id" IS NOT NULL AND "reviews_article"."published" = true ) 
ORDER BY "reviews_article"."created" DESC LIMIT 8 OFFSET 304

has plan:

Limit  (cost=882.89..906.12 rows=8 width=1868) (actual time=3.630..3.717 rows=8 loops=1)
  ->  Nested Loop  (cost=0.00..3319.54 rows=1143 width=1868) (actual time=0.063..3.397 rows=312 loops=1)
        ->  Index Scan using reviews_article_created_desc_index on reviews_article  (cost=0.00..1128.75 rows=1254 width=863) (actual time=0.030..0.926 rows=393 loops=1)
              Filter: published
              Rows Removed by Filter: 6
        ->  Index Scan using reversion_version_pkey on reversion_version  (cost=0.00..1.74 rows=1 width=1005) (actual time=0.003..0.003 rows=1 loops=393)
              Index Cond: ((id = reviews_article.version_id) AND (id IS NOT NULL))
Total runtime: 3.769 ms

but same query with OFFSET 312 has plan:

Limit  (cost=919.06..919.08 rows=8 width=1868) (actual time=16.974..16.987 rows=8 loops=1)
  ->  Sort  (cost=918.28..921.14 rows=1143 width=1868) (actual time=16.488..16.720 rows=320 loops=1)
        Sort Key: reviews_article.created
        Sort Method: top-N heapsort  Memory: 1154kB
        ->  Hash Join  (cost=369.75..865.01 rows=1143 width=1868) (actual time=4.895..13.286 rows=1136 loops=1)
              Hash Cond: (reversion_version.id = reviews_article.version_id)
              ->  Seq Scan on reversion_version  (cost=0.00..471.87 rows=3187 width=1005) (actual time=0.006..4.041 rows=3195 loops=1)
                    Filter: (id IS NOT NULL)
              ->  Hash  (cost=354.08..354.08 rows=1254 width=863) (actual time=4.877..4.877 rows=1136 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 1057kB
                    ->  Seq Scan on reviews_article  (cost=0.00..354.08 rows=1254 width=863) (actual time=0.005..2.521 rows=1254 loops=1)
                          Filter: published
                          Rows Removed by Filter: 254
Total runtime: 17.058 ms
Vladislav
  • 1,318
  • 16
  • 15
  • 1
    Note: `WHERE ("reversion_version"."id" IS NOT NULL ` can be avoided by changing the `LEFT join` into an ordinary `join` – wildplasser Apr 10 '13 at 09:17
  • I use django-orm. That's may difficult. – Vladislav Apr 10 '13 at 12:55
  • 1
    You can do `SET enable_seqscan=false` to try to force the first plan for second query and see how it will perform. – Jakub Kania Apr 10 '13 at 13:19
  • The `Filter: (id IS NOT NULL)` in the seqscan is probably caused by your substandard query-generator. (it is still something that the plan-generator *could* detect, because `id` is probably a PK, and thus not nullable, and thus NOT NULL) – wildplasser Apr 10 '13 at 13:22

1 Answers1

1

To traverses an entire table by index scan is worse than doing it with a sequential scan. Somewhere between an offset of 304 and an offset of 312, the planner decided it would be doing more I/O bouncing around by index than biting the bullet with the seq scan.

There are some possibilities. One, as mentioned, is SET enable_seqscan=0;. Another possible solution would be increasing the statistics parameter for this table and re-analyzing. (I assume the table was freshly analyzed before this test.)

By the way, it's not clear that this repeated LIMIT/OFFSET is good for paging through a result set, if that is what you are doing.

Andrew Lazarus
  • 18,205
  • 3
  • 35
  • 53