
We have a production PostgreSQL 9.6 database with about 100M records in LOYALTY and a new table INFO (fewer than a few thousand rows as of now).

Base tables (written as Django models):

class Loyalty(models.Model):
    customer = models.ForeignKey(Customer, db_index=True)
    order = models.ForeignKey(Order, null=True)  # i.e. no index!

class Info(models.Model):
    loyalty_adjustment = models.OneToOneField(Loyalty)
    order_number = models.CharField(max_length=50, db_index=True)
    ...
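For clarity, the `order` FK maps to an `order_id` column that currently has no index. Adding `db_index=True` to the field would correspond roughly to the following DDL (hypothetical index name; Django generates its own):

```sql
-- Hypothetical DDL: what db_index=True on Loyalty.order would create
CREATE INDEX loyalty_order_id_idx ON loyalty (order_id);
```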

QUERY 1:

explain (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
SELECT
    *
FROM
    "loyalty"
    LEFT OUTER JOIN "info" ON ("loyalty"."id" = "info"."loyalty_adjustment_id")
WHERE ("info"."order_number" = '21072621527905'
    OR "loyalty"."order_id" = 694781500)
LIMIT 21

This yields a slow query (full sequential scan on loyalty):

Limit  (cost=19.23..120.18 rows=21 width=220) (actual time=53814.148..77814.842 rows=1 loops=1)
  ->  Hash Left Join  (cost=19.23..2858123.35 rows=594498 width=220) (actual time=53814.147..77814.840 rows=1 loops=1)
        Hash Cond: (loyalty.id = info.info_id)
        Filter: (((info.order_number)::text = '21072621527905'::text) OR (loyalty.order_id = 694781500))
        Rows Removed by Filter: 118934642
        ->  Seq Scan on loyalty  (cost=0.00..2412225.44 rows=118899344 width=50) (actual time=1.001..59578.218 rows=118934643 loops=1)
        ->  Hash  (cost=14.10..14.10 rows=410 width=170) (actual time=0.508..0.508 rows=4 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 9kB
              ->  Seq Scan on info  (cost=0.00..14.10 rows=410 width=170) (actual time=0.500..0.500 rows=4 loops=1)
Planning time: 1.185 ms
Execution time: 77814.890 ms

Splitting the query into two queries without the OR clause makes it much, much faster: < 1 sec.
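The split version is roughly the following sketch (assuming the two single-condition queries are combined with `UNION`; each branch carries a condition on only one table, so each can use its own index instead of the full scan that the cross-table OR forces):

```sql
-- Sketch: two single-condition branches combined with UNION.
-- Each branch's WHERE touches only one table, so each can be
-- driven by an index on that table.
SELECT * FROM "loyalty"
LEFT OUTER JOIN "info" ON ("loyalty"."id" = "info"."loyalty_adjustment_id")
WHERE "info"."order_number" = '21072621527905'
UNION
SELECT * FROM "loyalty"
LEFT OUTER JOIN "info" ON ("loyalty"."id" = "info"."loyalty_adjustment_id")
WHERE "loyalty"."order_id" = 694781500
LIMIT 21;
```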

QUERY 2:

explain (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
SELECT
    *
FROM
    "loyalty"
    LEFT OUTER JOIN "info" ON ("loyalty"."id" = "info"."loyalty_adjustment_id")
WHERE
    "info"."order_number" = '21072620001657'
LIMIT 21

(EXPLAIN output for QUERY 2 was attached as a screenshot; not reproduced here.)

AND

QUERY 3:

explain (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
SELECT
    *
FROM
    "loyalty"
    LEFT OUTER JOIN "info" ON ("loyalty"."id" = "info"."loyalty_adjustment_id")
WHERE ("info"."order_number" = '21072620001657'
    AND "loyalty"."order_id" = 4967472)
LIMIT 21

(EXPLAIN output for QUERY 3 was attached as a screenshot; not reproduced here.)

  1. Why does the OR clause make this so much slower than two separate queries combined with UNION? Does it have anything to do with having a condition on each of the two tables?

  2. Why is QUERY 3 running an index scan, given that the LOYALTY table has no index on ORDER?

  3. In QUERY 2, even though it does an index scan, why is it no better than a full index scan when the condition pins down a single index value (one order_number)?

zcahfg2
  • It's hard to follow your example because it looks like you have joins between 3 different info table columns (`loyalty`, `loyalty_adjustment_id`, `loyalty_id`). If you don't have an index on the columns being queried or joined on, you'll likely experience bad queries (full scans). I can't say how the analyzer determines what to do, but it may adjust what it does between attempts if past performance doesn't seem great. This question would likely be better for the dba stackexchange forum. – ps2goat Jul 27 '21 at 15:48
  • Apologies loyalty_adjustment_id and loyalty_id are the same thing. Kindly consider it as a typo (I was trying to make concise example) LOYALTY is a table. I have edited the question above just now. – zcahfg2 Jul 27 '21 at 16:04
  • postgres makes use of the indexes it thinks will help, so that's why sometimes you'll see it use an index and sometimes not. If you truly have 100M rows and are getting these times, it doesn't look that bad to me. You'll get more accurate scalability of queries if you add more rows to your test environment-- sub 10 ms is not that bad at all (if truly 100M rows). Execution plans can change as more records are involved. – ps2goat Jul 27 '21 at 20:03

0 Answers