
So I currently have an issue where I am encountering duplicates based on a composite key when trying to insert into a table from a materialized view. The composite key is composed of three columns: TIMESTAMP, NAME and TYPE. I've tried the method below, but I still run into the same issue (I get a duplicate-key error and the stored procedure stops running). Is there a way to handle the exception so that it just skips over the duplicate row? If not, I thought of the DELETE statements below to get rid of duplicates first; is that the correct approach?

Below are the DELETE statements and the 'bulk insert' statement that I am using:

DELETE FROM table
 WHERE timestamp IN (SELECT DISTINCT ts FROM mv_1)
   AND person    IN (SELECT DISTINCT name FROM mv_1)
   AND type      IN (SELECT DISTINCT data_type FROM mv_1);
COMMIT;
--
DELETE FROM table
 WHERE timestamp || person || type IN
       (SELECT DISTINCT ts || name || data_type
          FROM edb.mv_1);
COMMIT;
--
INSERT INTO table
            (person,
             timestamp,
             type)
   SELECT name,
          ts,
          data_type
     FROM edb.mv_1 a
    WHERE a.row_id NOT IN
                 (SELECT row_id
                    FROM (SELECT v.*,
                                 ROW_NUMBER ()
                                    OVER (PARTITION BY v.name,
                                                       v.data_type,
                                                       v.ts
                                          ORDER BY row_id) AS rn
                            FROM mv_1 v) t
                   WHERE rn > 1);

Note: row_id is a column I added to MV_1 just so I would have a unique identifier for each row.

Thanks in advance!

John Wick
  • Is the problem actually because you have composite key combinations in `mv_1` that already exist in the target table? Or because there are duplicates within `mv_1`? Using `distinct` in your second delete kind of suggests the latter. (Also `distinct` with `in()` is a bit pointless). And if you do have duplicate keys, how are you deciding which to keep? (Also, to actually address your question about skipping the row: yes; [DML error logging](https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/INSERT.html#GUID-903F8043-0254-4EE9-ACC1-CB8AC0AF3423__BGBEIACB).) – Alex Poole Oct 30 '18 at 16:38
  • Thanks for your response @AlexPoole. I want to insert / keep the latest value. The composite key exists in the target table but not the materialized view (MV_1). There are two scenarios where duplicates can occur: 1) duplicates actually within MV_1, and 2) a record already existing in the target table for that particular composite key (if that makes sense). – John Wick Oct 30 '18 at 17:19
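
To illustrate the DML error logging approach mentioned in Alex Poole's comment above, here is a minimal sketch; the error table is created once with DBMS_ERRLOG, and the tag 'mv_1 load' and the object names below simply follow the placeholder names in the question. With a conventional-path insert, rows that violate the composite key are written to the error table and skipped instead of aborting the whole statement:

BEGIN
   -- one-off setup: creates the default error table ERR$_TABLE for the target table
   DBMS_ERRLOG.create_error_log(dml_table_name => 'TABLE');
END;
/

INSERT INTO table (timestamp, person, type)
SELECT ts, name, data_type
  FROM edb.mv_1
   LOG ERRORS INTO err$_table ('mv_1 load') REJECT LIMIT UNLIMITED;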

1 Answer


I'd make the table safe by adding a composite primary key (which you have probably done already):

ALTER TABLE table ADD CONSTRAINT pk_table PRIMARY KEY (timestamp, person, type);

Furthermore, I'd make the materialized view safe by adding a primary key there, too. Depending on how the MV is built, you need to change the query that is feeding it so that no duplicates occur. See for instance here.
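
As a minimal sketch, assuming a hypothetical base table base_t that feeds the MV and that the highest row_id per composite key is the row to keep, the defining query could aggregate away the duplicates:

CREATE MATERIALIZED VIEW mv_1 AS
SELECT ts, name, data_type, MAX(row_id) AS row_id
  FROM base_t            -- hypothetical base table feeding the MV
 GROUP BY ts, name, data_type;

With at most one row per (ts, name, data_type), a primary key or unique index can then be added to the MV safely.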

If both are safe, the insert is simple:

INSERT INTO table (timestamp, person, type, further_col1, col2, ...)
SELECT ts, name, data_type, further_col1, col2, ...
  FROM edb.mv_1 m
 WHERE NOT EXISTS (
       SELECT *
         FROM table t2
        WHERE t2.timestamp = m.ts
          AND t2.person    = m.name
          AND t2.type      = m.data_type
       );

If you have only the three columns, you can shorten the insert to:

INSERT INTO table (timestamp, person, type)
SELECT ts, name, data_type FROM edb.mv_1
MINUS
SELECT timestamp, person, type FROM table;

Please make sure the three columns are NOT NULL in both the table and the MV. And please don't actually name a table table or columns timestamp or type; I am sure these are only example names.
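
For example, a sketch using the placeholder names from above (the composite primary key suggested earlier already implies NOT NULL on the target table's key columns, so this mainly matters where no such constraint exists yet):

ALTER TABLE table MODIFY (timestamp NOT NULL, person NOT NULL, type NOT NULL);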

wolφi