How to ignore duplicate keys when extracting data using OPENQUERY while joining two tables?

Question

I am trying to insert records into MySQL database from a MS SQL Server using the "OPENQUERY" but what I am trying to do is ignore the duplicate keys messages. so when the query run into a duplicate then ignore it and keep going.

What ideas can I do to ignore the duplicates?

Here is what I am doing:

pulling records from MySQL using "OpenQuery" to define MySQL "A.record_id"
Joining those records to records in MS SQL Server "with a specific criteria and not direct id" from here I find a new related "B.new_id" record identifier in SQL Server.
I want to insert the found results into a new table in MySQL like so A.record_id, B.new_id Here in the new table I have A.record_id set as a primary key for that table.

The problem is that when joining table A to Table B some times I find 2+ records into table B matching the criteria that I am looking for which causes the value A.record_id to 2+ times in my data set before inserting that into table A which causes the problem. Note I can use aggregate function to eliminate the records.

@MartinSmith I removed the confusing part form my question. I tried to using the IGNORE_DUP_KEY but I get a syntax error. I don't know if I can use it with openquery. here is my use case INSERT IGNORE_DUP_KEY OPENQUERY — Jaylen, Jan 19 '14 at 18:38
`IGNORE_DUP_KEY` is a property of an index. Not a key word that you use in a query. It is not relevant unless you are inserting to SQL Server hence request for clarification on direction of transfer. — Martin Smith, Jan 19 '14 at 18:46

Gordon Linoff · Answer 1 · 2016-07-07T12:00:49.623

15

I don't think there is a specific option. But it is easy enough to do:

insert into oldtable(. . .)
    select . . .
    from newtable
    where not exists (select 1 from oldtable where oldtable.id = newtable.id)

If there is more than one set of unique keys, you can add additional not exists statements.

EDIT:

For the revised problem:

insert into oldtable(. . .)
    select . . .
    from (select nt.*, row_number() over (partition by id order by (select null)) as seqnum
          from newtable nt
         ) nt
    where seqnum = 1 and
          not exists (select 1 from oldtable where oldtable.id = nt.id);

The row_number() function assigns a sequential number to each row within a group of rows. The group is defined by the partition by statement. The numbers start at 1 and increment from there. The order by clause says that you don't care about the order. Exactly one row with each id will have a value of 1. Duplicate rows will have a value larger than one. The seqnum = 1 chooses exactly one row per id.

edited Jul 07 '16 at 12:00

answered Jan 19 '14 at 18:28

Gordon Linoff

1,242,037
58
646
786

thanks for that note. The problem is that with in my select I may be generating duplicates and when inserting 2 records with the same unique value I get an error because originally the records do not exists but when trying to insert the value 2 different times then i encounter the error. – Jaylen Jan 19 '14 at 18:33
Can you please explain the second query? I never used over (partition by ..... I don't know what it is and what it does. I need to know how can I apply it to my query.. – Jaylen Jan 19 '14 at 18:42
This is still now working for me. I re-stated my question to explain the problem that I am running into. please review that. I think your answer is on the right track – Jaylen Jan 19 '14 at 19:13
1

I think the row_number() over..... statment is not working as you are expecting because I see 2 rows with the same id and have a value 1 as – Jaylen Jan 19 '14 at 21:01
@Mike . . . That seems highly unlikely. I would suggest that if you are really seeing this, ask another question with the sample data and query you are using. – Gordon Linoff Jan 19 '14 at 21:11
4

"it is easy enough to do" -- except that isn't at all as easy as INSERT IGNORE. It is actually quite obnoxious. – Smack Jack Jul 07 '16 at 04:47

score 14 · Answer 2 · answered Jan 19 '14 at 18:33

14

If you are on SQL Server 2008+, you can use MERGE to do an INSERT if row does not exist, or an UPDATE.

Example:

MERGE
INTO    dataValue dv
USING   tmp_holding_DataValue t
ON      t.dateStamp = dv.dateStamp
        AND t.itemId = dv.itemId
WHEN NOT MATCHED THEN
INSERT  (dateStamp, itemId, value)
VALUES  (dateStamp, itemId, value)

answered Jan 19 '14 at 18:33

Barb C. Goldstein

454
1
3
13

1

It looks like he is trying to insert records to MySql from MS Sql with OPENQUERY. I don't think merge is going to help at all in this case. – Andrew Savinykh Jan 19 '14 at 18:57
3

Question and Title are confusing, and opposite. I looked at title where @Mike asks for `INGORE INSERT` equivalent of MySQL, in SQL Server. Whereas Question is saying opposite, i.e. inserting into MySQL from SQL Server. – Barb C. Goldstein Jan 19 '14 at 19:00

How to ignore duplicate keys when extracting data using OPENQUERY while joining two tables?

2 Answers2

Linked