PostgreSQL: Statistics on partial index?

Question

PostgreSQL version: 9.3.13

Consider the following tables, index and data:

CREATE TABLE orders (
 order_id bigint,
 status smallint,
 owner int,
 CONSTRAINT orders_pkey PRIMARY KEY (order_id)
)

CREATE INDEX owner_index ON orders
  USING btree
 (owner) WHERE status > 0;

CREATE TABLE orders_appendix (
  order_id bigint,
  note text
)

Data

orders:

(IDs, 0, 1337) * 1000000 rows
(IDs, 10, 1337) * 1000 rows
(IDs, 10, 777) * 1000 rows

orders_appendix:

one row for each order

My problem is:

select * from orders where owner=1337 and status>0

The query planner estimated the number of row to be 1000000, but actual number of row is 1000.

In a more complicated following query:

SELECT note FROM orders JOIN orders_appendix using (order_id)
WHERE owner=1337 AND status>0

Instead of using inner join (which is preferable for small number of rows), it picks bitmap join + a full table scan on orders_appendix, which is very slow.

If the condition is "owner=777", it will choose the preferable inner join instead.

I believe it is bacause of the statistics, as AFAIK postgres can only collect and consider stats for each column independently.

However, if I...

CREATE INDEX onwer_abs ON orders (abs(owner)) where status>0;

Now, a slightly changed query...

SELECT note FROM orders JOIN rders_appendix using (order_id)
WHERE abs(owner)=1337 AND status>0

will results in the inner join that I wanted.

Is there a better solution? Perhaps "statistics on partial index"?

As far as I know statistics on partial indexes are not collected. There is a patch for that, but I don't if that made it into 9.6 or if that will be part of next year's 10.0. but to answer the question we need the output of `explain (analyze, verbose)` for both queries. _Formatted_ text please, [no screen shots](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question/285557#285557) — , Jul 05 '16 at 17:02

PostgreSQL: Statistics on partial index?

0 Answers0