4

Suppose I have two tables job and batch:

CREATE TABLE batch
(
  batch_id   NUMBER(20) PRIMARY KEY,
  batch_type NUMBER(20),
  [some other values] ...
);

CREATE TABLE job
(
  job_id NUMBER(20) PRIMARY KEY,
  job_batch_id NUMBER(20),
  job_usr_id NUMBER(20),
  job_date DATE,
  [some other values] ...
  CONSTRAINT fk_job_batch
    FOREIGN KEY (job_batch_id) REFERENCES batch(batch_id),
  CONSTRAINT fk_job_usr
    FOREIGN KEY (job_usr_id) REFERENCES client(usr_id)
);

And suppose they each contain a considerable amount of data (many millions of rows). What I want to do is create a Materialized View to reflect, for each usr_id, what the first and last jobs were run for a particular type of batch. For example:

CREATE MATERIALIZED VIEW client_first_last_job
(usr_id, first_job_date, last_job_date)
AS
(
  SELECT
    job_usr_id    AS usr_id,
    MIN(job_date) AS first_job_date,
    MAX(job_date) AS last_job_date
  FROM job, batch
  WHERE job_batch_id=batch_id
    AND batch_type IN (1,3,5,9)
  GROUP BY job_usr_id
);

This is all well and good, but because there are so many records it takes a very long time to build this materialized view (far longer than is acceptable to deal with each time it needs to refresh). My immediate thought would be to use Materialized View Logs to have incremental updates. These are easy enough to create. But when I try to build the MV to use REFRESH FAST ON DEMAND, that gets me a ORA-12015: cannot create a fast refresh materialized view from a complex query error, which from some Googling I am guessing is due to the coexistence of a join and aggregate functions.

Is there another way to do this? Note that de-normalization or other alterations to the parent tables are not feasible.

Mike Carpenter
  • 197
  • 1
  • 2
  • 9
  • I'm curious if a non-atomic complete refresh would refresh fast enough for your needs: exec dbms_mview.refresh('CLIENT_FIRST_LAST_JOB', 'C', atomic_refresh => false); Just in-case you aren't aware of its limitations, a non-atomic refresh has the notable draw back that the table can appear empty during the refresh process (even to other users and sessions). However, it will run much faster since it performs a truncate + insert append; As opposed to a regular complete refresh which performs a delete + insert. – Kris Johnston Feb 28 '17 at 18:00

1 Answers1

3

You can nest your mviews which you can read about from the docs:

CREATE MATERIALIZED VIEW joinmview
(usr_id, job_date)
REFRESH FORCE ON DEMAND
AS
(
  SELECT
    job_usr_id    AS usr_id,
    job_date
  FROM job, batch
  WHERE job_batch_id=batch_id
    AND batch_type IN (1,3,5,9)
);

CREATE MATERIALIZED VIEW LOG ON JOINMVIEW 
  WITH ROWID (usr_id, JOB_DATE) including new values;

CREATE MATERIALIZED VIEW client_first_last_job
(usr_id, first_job_date, last_job_date)
REFRESH FORCE ON DEMAND
AS
(
  SELECT
    usr_id,
    MIN(job_date) AS first_job_date,
    MAX(job_date) AS last_job_date
  FROM joinmview
  GROUP BY usr_id
);

Verify that both mviews can fast refresh:

exec dbms_mview.refresh('JOINMVIEW', 'C');
exec dbms_mview.refresh('JOINMVIEW', 'F');

exec dbms_mview.refresh('CLIENT_FIRST_LAST_JOB', 'C');    
exec dbms_mview.refresh('CLIENT_FIRST_LAST_JOB', 'F');

You can put both mviews into the same refresh group (docs), just be sure to add them in the order of their dependency. In other words, in this example, add JOINMVIEW before you add CLIENT_FIRST_LAST_JOB to the refresh group.

Kris Johnston
  • 718
  • 5
  • 15
  • That `joinmview` is going to have some very steep storage implications, i.e. about 75% the number of rows in the parent table. I don't know that I could reasonably commit to that much. – Mike Carpenter Feb 27 '17 at 22:23
  • @MikeCarpenter Since Streams replication is deprecated as of Oracle 11 or 12, I'm not sure you have another solution outside of Oracle Goldengate, which is an expensive extra cost option (provided you are already using Enterprise Edition). – Kris Johnston Feb 27 '17 at 22:27
  • I understand. I was afraid it would be something like that. – Mike Carpenter Feb 27 '17 at 22:29
  • How big are batch and job tables, together with the resulting mview? And how long does it take to build the mview? – BobC Feb 28 '17 at 04:26
  • @BobC Hundreds of millions of rows. The view in the question takes just over 2 hours to build. I am going to test the views in the answer as soon as I have a chance. – Mike Carpenter Feb 28 '17 at 04:43
  • What sort of hardware do you have; number of CPU cores, observed IO bandwidth? Also, can you share the execution plan for the query to build the mview? – BobC Feb 28 '17 at 04:46