2

Consider the following structures, a header and line table, both of which are partitioned by date:

create table stage.order_header (
  order_id    int   not null,
  order_date  date  not null
);

create table stage.order_line (
  order_id     int   not null,
  order_date   date  not null,
  order_line   int   not null
);

create table stage.order_header_2013 (
  constraint order_header_2013_ck1
    check (order_date >= '2013-01-01' and order_date < '2014-01-01')
) inherits (stage.order_header);

create table stage.order_header_2014 (
  constraint order_header_2014_ck1
    check (order_date >= '2014-01-01' and order_date < '2015-01-01')
) inherits (stage.order_header);

create table stage.order_line_2013 (
  constraint order_line_2013_ck1
    check (order_date >= '2013-01-01' and order_date < '2014-01-01')
) inherits (stage.order_line);

create table stage.order_line_2014 (
  constraint order_line_2014_ck1
    check (order_date >= '2014-01-01' and order_date < '2015-01-01')
) inherits (stage.order_line);

If I look at the explain plan on the following query:

select
  *
from
  stage.order_header h
  join stage.order_line l on
    h.order_id = l.order_id and
    h.order_date = l.order_date
where
  h.order_date = '2014-04-01'

It invokes both check constraints and only physically scans the "2014" partitions.

However, if I use an inequality:

where
  h.order_date > '2014-04-01' and
  h.order_date < '2014-05-01'

The check constaint is invoked on the header, but not on the line, and the query will scan the entire line_2013 table, even though the records cannot exist. My thought was that since order_date is included in the join, then any limits on it in one table would propagate to the joined table, but that doesn't appear to be the case.

If I explicitly to this:

where
  h.order_date > '2014-04-01' and
  h.order_date < '2014-05-01' and
  l.order_date > '2014-04-01' and
  l.order_date < '2014-05-01'

Then everything works as expected.

My question is this: I now know this and can add the extra limitations in the where clause, but my concern is with everyone else using the database that doesn't know to do this. Is there a structural (or other) change I can make that would resolve this? I tried adding foreign key constraints, but that didn't change the plan.

Also, the query itself is really physically scanning the 2013 table. It's not just the explain plan.

EDIT:

I did submit this to bugs, but it appears this behavior is unlikely to change... this is what promted me to see a workaround.

The response to my report was:

If I specifically invoke the range on both the h and l tables, it will work fine, but since the join specifies those fields have to be the same, can that condition be propagated automatically?


No. We currently deduce equality transitively, so the planner is able to extract the constraint l.transaction_date = '2014-03-01' from your query (and then use that to reason about the check constraints on l's children). But there's nothing comparable for inequalities, and it's not clear that adding such logic to the planner would be a net win. It would be more complicated than the equality case and less often useful.

Hambone
  • 15,600
  • 8
  • 46
  • 69
  • BTW: with an FK constraint, order_line.order_date would be redundant. Just saying ... – joop May 09 '14 at 13:29
  • A PK on both tables might be helpful too – Patrick May 09 '14 at 13:34
  • It used to not work with the `=` case either. It then got reported on the pg-hackers or pg-bugs list as an optimization suggestion to improve the query planner. Consider doing the same with this case, as I imagine it's an oversight. – Denis de Bernardy May 09 '14 at 13:39
  • Some quick feedback -- the tables actually do have PKs. These tables are simplified to illustrate this specific issue. Also, the date in the FK isn't actually redundant. The order numbers recycle, so the PK is the order # and the order date combined. I posted an edit with the response from pg-bugs – Hambone May 09 '14 at 14:45

0 Answers0