0

as a follow up on my question original question posted here

UPDATE in Batches Does Not End and Remaining Data Does Not Get Updated

If you use the logic below you'll see that update never finishes. Let me know if you have any ideas why...

Table 1

IF OBJECT_ID('tempdb..#Table2') IS NOT NULL
        BEGIN 
            DROP TABLE #Table2;
        END

        CREATE TABLE #Table2 (ID INT);

        DECLARE @Count int = 0;

        WHILE (select count(*) from #Table2) < 10000 BEGIN
            INSERT INTO #Table2 (ID) 
            VALUES (@Count)

            -- Make sure we have a unique id for the test, else we can't identify 10 records
            set @Count = @Count + 1;
        END

Table 2

IF OBJECT_ID('tempdb..#Table1') IS NOT NULL
    BEGIN 
        DROP TABLE #Table1;
    END

    CREATE TABLE #Table1 (ID INT);

    DECLARE @Count int = 0;

    WHILE (select count(*) from #Table1) < 5000 BEGIN
        INSERT INTO #Table1 (ID) 
        VALUES (@Count)

        -- Make sure we have a unique id for the test, else we can't identify 10 records
        set @Count = @Count + 1;
    END


     /****************** UPDATE ********************/

    select count (*) from #Table2 t2 where Exists (select * from #Table1 t1 where t1.ID = t2.ID)
    select count (*) from #Table2 where ID = 0
    select count (*) from #Table1 where ID = 0
       -- While exists an 'un-updated' record continue
    WHILE exists (select 1 from #Table2 t2 where Exists (select * from #Table1 t1 where t1.ID = t2.ID) ) 
    BEGIN

        -- Update any top 10 'un-updated' records
        UPDATE t2
        SET ID = 0
        FROM #Table2 t2
        WHERE ID IN (select top 10 id from #Table2 where Exists (select * from #Table1 t1 where t1.ID = t2.ID) )
    END
Dale K
  • 25,246
  • 15
  • 42
  • 71
Data Engineer
  • 795
  • 16
  • 41

1 Answers1

1

Your UPDATE statement is referencing the wrong instance on #Table2. You want the following:

UPDATE t2 SET
    ID = 0
FROM #Table2 t2
WHERE ID IN (
    SELECT TOP 10 ID
    -- note this alias is t2a, and is what the `exists` needs to reference
    -- not the table being updated (`t2`)
    FROM #Table2 t2a
    WHERE EXISTS (SELECT 1 FROM #Table1 t1 WHERE t1.ID = t2a.ID)
)

Note: For testing ensure that @Count starts from 1 not 0 else you do still end up with an infinite loop.

Dale K
  • 25,246
  • 15
  • 42
  • 71
  • This does not work. I've already fixed the alias. I tried your code as well. Still UPDATE never ends. Plus on top of it the original issue is still persists (only 10 records get update no matter how many iteration of UPDATE already finished) – Data Engineer Mar 23 '20 at 23:02
  • There is another bug I forgot to mention,change `DECLARE @Count int = 0;` to `DECLARE @Count int = 1;` - starting from zero, when you are updating to zero does of course fail. But again I assume that is not a real life scenario. – Dale K Mar 23 '20 at 23:25
  • @enigma6205 any joy? – Dale K Mar 24 '20 at 00:56
  • Not sure how @Count impacts the Update statement if it's not even used in it... If you run this you'll see that I we have 5000 records to update. select count (*) from #Table2 t2 where Exists (select * from #Table1 t1 where t1.ID = t2.ID) – Data Engineer Mar 24 '20 at 01:25
  • @enigma6205 its because you are using it to generate the test id's, starting from zero, when you are then setting them to zero, means this test `select 1 from #Table1 t1 where t1.ID = t2.ID` will always pass, because you start with a zero in #table1 which will always match regardless of any updates. Start it from one and it solves the problem. As I say in real life I doubt you have an id of zero, if you do you'd need a different solution. – Dale K Mar 24 '20 at 01:28
  • Trust me, I had an infinite loop until I changed that. Then it worked. – Dale K Mar 24 '20 at 01:28
  • I see, that makes sense. And as you mentioned in real data set I should not have ID as "0", though I do have NULLs. What about them? Should I just filter them out? usually NULL=NULL returns falls – Data Engineer Mar 24 '20 at 01:43
  • nulls need special treatment, to this update I would add `and id is not null` to avoid them. And then add a second loop to handle the case when `id is null`. – Dale K Mar 24 '20 at 02:54