0

I have a dimensional model with a large fact table (millions of rows) which is range partitioned by date and smaller dimensional tables that are not partitioned. I came across materialized views which is often used in these scenarios to improve query performance.

Now, I want to know which way is better of the following two to utilize these materialized views to get aggregated reports.

A. Create one with the by joining the whole fact table with each of the dimension tables required.

create materialized view my_mview execute immediate query rewrite
    select 
       fact.col1, dim1.col2, dim2.col3, sum(fact.col4)
    from 
       my_fact fact 
    inner join
      my_dim1 dim1
       on fact.dim1_key = dim1.dim1_key
    inner join 
      my_dim2 dim2
       on fact.dim2_key = dim2.dim2_key group by fact.col1, dim1.col2, dim2.col3

This seems like the most basic way of using them. But it seems rather limiting and I would require a new materialzed view for each variation of the query I want to create.

B. Create it over the aggregation of the fact table and utilize the query rewrite when doing a dimensional join back.

create materialized view my_mview execute immediate query rewrite
    select 
       col1, dim1.dim2_key, dim2.dim_key, sum(fact.col4)
    from 
       my_fact fact 

And do the join as above in case A, which will use this aggregated materialzed view for the join and not the whole fact table.

Can anyone tell me when I would use each case or the other?

sfactor
  • 12,592
  • 32
  • 102
  • 152

1 Answers1

0

Your first example works exactly as you described.

For the second example the query should be:

create materialized view my_mview execute immediate query rewrite
    select 
       col1, fact.dim2_key, fact.dim_key, sum(fact.col4)
    from 
       my_fact fact
    group by
       col1, fact.dim2_key, fact.dim_key

This will automatically speed up aggregates such as

select sum(fact.col4) 
  from fact

select fact.dim_key,sum(fact.col4) 
  from fact
  group by fact.dim_key

select fact.dim2_key,sum(fact.col4) 
  from fact
  group by fact.dim2_key

I don't think Oracle will rewrite your first type of query to this MV automatically because in the MV the join columns are already grouped by (They also should be grouped in your second example). It never happened for us. This however may also depend on if there are relationships defined between dim and fact table and the value of QUERY_REWRITE_INTEGRITY parameter, so there is still some room for testing here.

You may still get a performance gain by writing a query in a specific way

WITH preaggr as (
    select 
       col1, fact.dim2_key, fact.dim_key, sum(fact.col4)
    from 
       my_fact fact
    group by
       col1, fact.dim2_key, fact.dim_key
)

select
  dim2.col1,
  sum(preaggr.col4)
from
  preaggr
join
  dim2
on
  preaggr.dim2_key = fact.dim2_key
group by  
  dim2.col1
ms32035
  • 159
  • 4