Complex TSQL Merge

Question

I have 'inherited' a brilliant piece of TSQL code that does this:

Loops row-by-row over a cursor.
The cursor contains data that need to be merged (Upserted) in Table A
For each row loop in the cursor a stored proc is called. The proc:
- If a corresponding row exists in Table A then it is updated
- If such a row does not exist then:
  - Inserts a single row in in a different Table B.
  - Fetches the newly generated ID (say its called IDB)
  - Inserts a single row in Table A. Table A insertions need an IDB (the field is not null, it is supposed to have values ONLY from table B, but no FK constraint is in place)

Obviously this sucks (performance & elegance reasons)!!

Question At first this looks like a standard case of MERGE usage. I tried doing:

MERGE [dbo].[TableA] AS Target
USING <cursor data set as a select statement> as Src on target.IDA = Src.IDA
WHEN MATCHED 
  //update
WHEN NOT MATCHED
//insert <------ Fails because obviously a new IDB is required

Also tried various approaches like a nested select that sends IDB on the OUTPUT but it fails because IDB is a PK.

Other kinds of merges also failed eg:

MERGE Table A with <cursor data set as a select statement>
...
MERGE Table A with Table B
WHEN NOT MATCHED
//insert on Table A
WHEN NOT MATCHED
// Update Table B

Does anyone have an idea on this? Essentially I think if we generalise the question would be:

Can I insert and return the PK in one statement that can be nested in other statements

Thanks in advance for any replies

George

You can merge into tableB and use the [technique here](http://stackoverflow.com/q/5365629/73226) I think. — Martin Smith, Sep 21 '11 at 14:00
Thanks, saw that. Its actually a nice idea. Essentially the guy stores newly generated Ids in temp variables. Not the same as what I was looking for (to do the whole process in 1 Merge) but its still an interesting approach. Thanks Martin. — George, Sep 21 '11 at 15:58
What is IDB? Is it possible to change that IDB to a autoincrement, or default value of somesort? Also, can you essentially wrap the act of generating a new ID into a function and then be able to use that in your merge statement? — M.R., Sep 21 '11 at 16:16

score 3 · Accepted Answer · answered Sep 21 '11 at 19:16

If you have an autogenerated PK on TableB, you can use code similar to this. Otherwise, just change the INSERT into TableA to grab the PK from TableB first.

DECLARE @OldData CHAR(10)
SET @OldData = 'Old'
DECLARE @NewData CHAR(10)
SET @NewData = 'New'

CREATE TABLE #TableA 
(
    IDA INT IDENTITY(1,1) PRIMARY KEY,
    IDB INT NOT NULL,
    DataA CHAR(10)
)

CREATE TABLE #TableB 
(
    IDB INT IDENTITY(1,1) PRIMARY KEY,
    DataB CHAR(10)
)

DECLARE @IDsToUpsert TABLE
(
    ID INT
)

-- Add test values for existing rows 
INSERT INTO #TableB
OUTPUT INSERTED.IDB, @OldData
INTO #TableA
SELECT @OldData UNION ALL
SELECT @OldData UNION ALL
SELECT @OldData UNION ALL
SELECT @OldData 

-- Add test values for the rows to upsert
INSERT INTO @IDsToUpsert
SELECT 1 UNION -- exists
SELECT 3 UNION -- exists
SELECT 5 UNION -- does not exist
SELECT 7 UNION -- does not exist
SELECT 9       -- does not exist

-- Data Before
SELECT * From #TableA
SELECT * From #TableB

DECLARE rows_to_update CURSOR
    FOR SELECT ID FROM @IDsToUpsert

DECLARE @rowToUpdate INT
DECLARE @existingIDB INT

OPEN rows_to_update;

FETCH NEXT FROM rows_to_update 
INTO @rowToUpdate;

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRANSACTION

        IF NOT EXISTS 
        (
            SELECT 1 FROM #TableA WITH (UPDLOCK, ROWLOCK, HOLDLOCK)
            WHERE IDA = @rowToUpdate            
        )
        BEGIN
            -- Insert into B, then insert new val into A
            INSERT INTO #TableB
            OUTPUT INSERTED.IDB, INSERTED.DataB 
            INTO #TableA
            SELECT @NewData
            -- Change code here if PK on TableB is not autogenerated
        END
        ELSE
        BEGIN
            -- Update
            UPDATE #TableA
            SET DataA = @NewData
            WHERE IDA = @rowToUpdate
        END

    COMMIT TRANSACTION

    FETCH NEXT FROM rows_to_update 
    INTO @rowToUpdate;
END

CLOSE rows_to_update;
DEALLOCATE rows_to_update;

SELECT * FROM #TableA
SELECT * FROM #TableB

DROP TABLE #TableA
DROP TABLE #TableB

I think its a very nice answer; this <100point poster thanks you ;-) — George, Sep 22 '11 at 09:54

score 1 · Answer 2 · answered Sep 21 '11 at 16:22

To answer your general question - 'Can I insert and return the PK in one statement that can be nested in other statements' - yes, absolutely. But it depends on the logic behind the creation of your PK. In this case, it seems like to generate a PK, you need to insert into a different table and then grab the newly generated ID from there. This is not very efficient (IMHO) unless there is a very specific reason for doing so. Autoincrements, GUIDs, etc tend to work better as PKs. If you can simplify/change the logic behind this, and you can find a simpler way to accomplish that, so the PK 'CAN' be generated in one statment/function and thus can be used in other statements.

Complex TSQL Merge

2 Answers2