0

I have a really slow query and I'm trying to speed it up.

I have a target date range (dateFrom/dateTo) defined in a table with only one row I need to use as a limit against a table with millions of rows. Is there a best practice for this?

I started with one table with one row with dateFrom and dateTo fields. I can limit the rows in the large table by CROSS JOINing it with the small table and using the WHERE clause, like:

select
  count(*)
from
  tblOneRow o, tblBig b
where
  o.dateFrom < b.dateTo and
  o.dateTo >= b.dateFrom

or I can inner join the tables on the date range, like:

select
  count(*)
from
  tblOneRow o inner join 
  tblBig b on
    o.dateFrom < b.dateTo and
    o.dateTo >= b.dateFrom

but I thought if I changed my single-row table to use one field with a PERIOD data type instead of two fields with DATE data types, it could improve the performance. Is this a reasonable assumption? The explain isn't showing a time difference if I change it to:

select
  count(*)
from
  tblOneRow o inner join 
  tblBig b on
    begin(o.date) < b.dateTo and
    end(o.date) >= b.dateFrom

or if I convert the small table's date range to a PERIOD data type and join ON P_INTERSECT, like:

select
  count(*)
from
  tblOneRow o inner join 
  tblBig b on
    o.date p_intersect period(b.dateFrom, b.dateTo + 1) is not null

to help the parsing engine with this join, would I need to define the fields on the large table with a period data type instead of two dates? I can't do that as I don't own that table, but if that's the case, I'll give up on improving performance with this method.

Thanks for your help.

Beth
  • 9,531
  • 1
  • 24
  • 43

1 Answers1

1

I don't expect any difference between the first three Selects, Explain should be the same a product join (the optimizer should expect exactly one row, but as it's duplicated the estimated size should be the number of AMPs in your system). The last Select should be worse, because you apply a calculation (OVERLAPS would be more appropriate, but probably not better).

One way to improve this single row cross join would be a View (select date '...' as dateFrom, date '...' as dateTo) instead of the single row table. This should resolve the dates and result in hard-coded dateFrom/To instead of a product join.

Similar when you switch to Scalar Subqueries:

select
  count(*)
from
  tblBig b
where
  (select min(o.dateFrom) from tblOneRow)  < b.dateTo
 and
  (select min(o.dateTo) from tblOneRow) >= b.dateFrom
dnoeth
  • 59,503
  • 4
  • 39
  • 56
  • thx, Dieter. Not sure how to avoid my full table scan, plus there's 800 rows with dateTo > dateFrom, would need extra code to avoid that condition in a period function. So converting from 2 date fields to a period field won't improve performance? – Beth Nov 14 '16 at 20:40
  • 1
    @Beth: No, a Period is the same as two seperate columns with an additional check: `dateFrom < dateTo` – dnoeth Nov 14 '16 at 20:53