0

General Overview: I have an Oracle table 'product' that contains approximately 80 million records and I would like to improve the performance of joins that use this table. In most cases we are interested in a very small subset of records from (table) 'product' with (column) 'valid_until' date (value) 'mm/dd/9999'.

Possible solutions:

  • Partition 'mm/dd/9999' and use partition exchange to quickly load new data.
  • Use an index on 'valid_until' date.

Do you guys have any other possible Oracle solutions or ideas?

Jacek Trociński
  • 882
  • 1
  • 8
  • 23
  • Can you show a typical query against the dim and fact table? – David Aldridge May 11 '15 at 13:37
  • @DavidAldridge Nope, the only info I have is that joins are often done on this table on the very small subset of records with valid_until 'mm/dd/9999'. I am not optimizing their queries. – Jacek Trociński May 11 '15 at 13:41
  • So typically it is a case of "select ... from ... where valid_until = ", not "find the most recent valid_until date for these products and use those to join to the fact table"? – David Aldridge May 11 '15 at 13:45
  • @DavidAldridge from what I understnad yes, most queries are interested in a valid_until date with a year equivalant to 9999 (which just means it's current and not historical data) – Jacek Trociński May 11 '15 at 13:57
  • So when a new version of the product is loaded, the "current" one gets a new date on it, and the newly added one is inserted with a 9999 date? Do you know what proportion of the records in the table have a 9999 date on them? I think that would be critical. – David Aldridge May 11 '15 at 14:01
  • @DavidAldridge yup, and about 1% of the records are 9999. – Jacek Trociński May 11 '15 at 14:05

1 Answers1

1

Based on needing to find 1% of records, I would expect an index to be adequate. It might pay to include the PK of the table as well if the query is just to find that for the current products.

If there is not a need to identify records by other valid_until dates then it might be worth using Oracle's equivalent of a partial index by indexing on:

case value_until
when date '...whatever the date is...'
then valid_until
else null
end

... but that would mean changing the schema or the tool that generates the queries or both.

You might keep an eye on the table's statistics to make sure that the cardinality of the selected rows is subject to a reasonably accurate estimation.

I wouldn't go for a partition-based solution as a first choice, as the overhead of row-migration during the update of the valid_until values would be fairly high, but if an index cannot deliver the query performance then by all means try.

David Aldridge
  • 51,479
  • 8
  • 68
  • 96