1

I found that Postgres is not using an index for a range query on a partitioned table.

The parent table and its partitions have their date column indexed using btree.

A query like this:

select * from parent_table where date >= '2015-07-01';

does not use indexes.

EXPLAIN result:

Append  (cost=0.00..106557.52 rows=3263963 width=128)
->  Seq Scan on parent_table  (cost=0.00..0.00 rows=1 width=640)
    Filter: (date >= '2015-07-01'::date)
->  Seq Scan on z_partition_2015_07  (cost=0.00..106546.02 rows=3263922 width=128)
    Filter: (date >= '2015-07-01'::date)
->  Seq Scan on z_partition_2015_08  (cost=0.00..11.50 rows=40 width=640)
    Filter: (date >= '2015-07-01'::date)

But a query like this:

select * from parent_table where date = '2015-07-01'

uses an index.

EXPLAIN result:

    Append  (cost=0.00..30400.95 rows=107602 width=128)
->  Seq Scan on parent_table  (cost=0.00..0.00 rows=1 width=640)
    Filter: (date = '2015-07-01'::date)
->  Index Scan using z_partition_2015_07_date on z_partition_2015_07  (cost=0.43..30400.95 rows=107601 width=128)
    Index Cond: (date = '2015-07-01'::date)

When I run the query on a different normal table with date indexed, both queries use the index.

Anything particular that we should do on partitioned table index?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
hudarsono
  • 389
  • 4
  • 19
  • try use date between condition – Adam Silenko Apr 06 '16 at 09:07
  • Show the explain analyze plans for both of them. – Jakub Kania Apr 06 '16 at 09:10
  • Hard to say without knowing about size of your data and its structure. First of all try `VACUUM ANALYZE parent_table` to collect statistics on your partitioned table. If it didn't help try in psql `SET enable_seqscan = off` and repeat your query. Planner should use index scan so you could compare costs of seqscan and indexscan. Most likely seqscan is simply cheaper for this type of query. Indexscan is not very good at fetching large amounts of data. – Ildar Musin Apr 06 '16 at 09:17
  • @AdamSilenko Still the same with between – hudarsono Apr 06 '16 at 09:44
  • @Musin Yes, the partition contains about 3 million rows, and the query will pull about 10% of it. – hudarsono Apr 06 '16 at 09:52
  • probably answer of [this question](http://stackoverflow.com/questions/1007523/how-can-i-use-an-index-on-a-partitioned-table-in-postgresql-8-3-7) will be helpful (looks similar to your problem) – Adam Silenko Apr 06 '16 at 10:03
  • Run `SET enable_seqscan = off` and then `explain select * from parent_table where date >= '2015-07-01'`. Does it cost more than seqscan? – Ildar Musin Apr 06 '16 at 10:44

2 Answers2

3

I assume you are aware that "partitions" are separate tables in Postgres. Indexes are typically not used when retrieving large parts of a table (more than ~ 5 %, it depends on many details), because it's typically faster to just scan the table sequentially in such cases.

What's more, it seems like you select all rows from the involved partitions in your first query. No use for indexes ...

Generally, an equality predicate with = is more selective than a predicate with >=.Think about it:

Your first query with date >= '2015-07-01' retrieves all rows from the partition (guessing, I would need to see the exact definition). Using the index would just add overhead cost. But your second query with date = '2015-07-01' only fetches a small percentage. Postgres expects an index scan to be faster.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
1

maybe that's just faster that way? run your query, then do this:

SET enable_seqscan=false

And run it again.

murison
  • 3,640
  • 2
  • 23
  • 36