0

I need to fix a prod DB issue and the clean up script I've is taking really long. I tried couple of things without any luck, following is the script:

DECLARE @ErrorMessage NVARCHAR(4000)
DECLARE @ErrorSeverity INT
DECLARE @ErrorState INT
DECLARE @ErrorProcedure NVARCHAR(50)

BEGIN TRY

    IF OBJECT_ID('tempdb..#SuspectData') IS NOT NULL
    BEGIN
        DROP TABLE #SuspectData 
    END

    CREATE TABLE #SuspectData
      (
          IID INT,
          CID INT,
          PID INT
      )

    INSERT INTO dbo.#SuspectData 
    SELECT DL.IID,DL.CID,IT.PID FROM DL
    INNER JOIN IT ON IT.CID = DL.CID AND IT.IID = DL.IID
    WHERE DL.Suspect = 1


    WHILE (1 = 1)
    BEGIN
        BEGIN TRANSACTION

        UPDATE TOP (5000) TDS
        SET TDS.DTID = 4 
        FROM 
        TDS
        INNER JOIN dbo.#SuspectData SD
        ON TDS.IID = SD.IID AND TDS.PID = SD.PID
        WHERE TDS.DTID <> 4

       IF @@ROWCOUNT = 0
       BEGIN
          COMMIT TRANSACTION
          BREAK
       END

       COMMIT TRANSACTION
    END

    WHILE (1 = 1)
    BEGIN
        BEGIN TRANSACTION

        UPDATE TOP (5000) TDA
        SET TDA.DTID = 4 
        FROM 
        TDA
        INNER JOIN dbo.#SuspectData SD
        ON TDA.IID = SD.IID AND TDA.PID = SD.PID
        WHERE TDA.DTID <> 4

        IF @@ROWCOUNT = 0
        BEGIN
          COMMIT TRANSACTION
          BREAK
        END

       COMMIT TRANSACTION
    END

    DROP TABLE #SuspectData

END TRY

BEGIN CATCH

    SELECT  @ErrorMessage = ERROR_MESSAGE(),
            @ErrorSeverity = ERROR_SEVERITY(),
            @ErrorState = ERROR_STATE(),
            @ErrorProcedure = ERROR_PROCEDURE()

    RAISERROR (@ErrorMessage,@ErrorSeverity,@ErrorState,@ErrorProcedure) ;

END CATCH

I also have following script to update everything at same time but it is also taking really long time like 24 hours or something.

DECLARE @ErrorMessage NVARCHAR(4000)
DECLARE @ErrorSeverity INT
DECLARE @ErrorState INT
DECLARE @ErrorProcedure NVARCHAR(50)

BEGIN TRY

    IF OBJECT_ID('tempdb..#SuspectData') IS NOT NULL
    BEGIN
        DROP TABLE #SuspectData 
    END

    CREATE TABLE #SuspectData
      (
          IID INT,
          CID INT,
          PID INT
      )

    INSERT INTO dbo.#SuspectData 
    SELECT DL.IID,DL.CID,IT.PID FROM DL
    INNER JOIN IT ON IT.CID = DL.CID AND IT.IID = DL.IID
    WHERE DL.Suspect = 1

    BEGIN TRANSACTION

        --Update about 1.5M records
        UPDATE TDS
        SET TDS.DTID = 4 
        FROM 
        TDS
        INNER JOIN dbo.#SuspectData SD
        ON TDS.IID = SD.IID AND TDS.PID = SD.PID
        WHERE TDS.DTID <> 4

    COMMIT TRANSACTION

    BEGIN TRANSACTION

        --Update about 4.5M records
        UPDATE TDA
        SET TDA.DTID = 4 
        FROM 
        TDA
        INNER JOIN dbo.#SuspectData SD
        ON TDA.IID = SD.IID AND TDA.PID = SD.PID
        WHERE TDA.DTID <> 4

    COMMIT TRANSACTION

    DROP TABLE #SuspectData

END TRY

BEGIN CATCH

    SELECT  @ErrorMessage = ERROR_MESSAGE(),
            @ErrorSeverity = ERROR_SEVERITY(),
            @ErrorState = ERROR_STATE(),
            @ErrorProcedure = ERROR_PROCEDURE()

    RAISERROR (@ErrorMessage,@ErrorSeverity,@ErrorState,@ErrorProcedure) ;

END CATCH
alroc
  • 27,574
  • 6
  • 51
  • 97
  • 1
    Your transaction needs to be broken into pieces. The way you coded it is actually slower than the lower script because the transaction is the same size but you slowed it down by using a loop. Does this update need to be an all or nothing thing or can you break into manageable chunks? From what I can see there is no reason to use your own transactions, you don't have a single rollback anywhere. And WHILE 1= 1??? Seriously??? – Sean Lange Sep 25 '14 at 14:11
  • 1
    And index your temp table, that coudl help alot if the data set is large and the table it is being joined to is large. – HLGEM Sep 25 '14 at 14:44

2 Answers2

1

I'm guessing that TDS table is large. In that case you can speed up join operation between your temp table and TDS (ON TDS.IID = SD.IID AND TDS.PID = SD.PID) by creating index on your temporary table:

either primary clustered:

CREATE TABLE #SuspectData
      (
          IID INT,
          CID INT,
          PID INT,
          CONSTRAINT pk_temp PRIMARY KEY(IID, PID)
      )

or not-clustered (if IID-PID pairs are not unique):

CREATE INDEX IDX_Temp_SuspectData ON #SuspectData(IID,PID)

What you can also do is check execution plan of those queries - it will help you locate which operation takes so long. On the side: I'm generally against using cursors if you can avoid it.

semao
  • 1,757
  • 12
  • 12
0

First, is there anything that changes DL.Suspect = 1 to something else? or does your data set just keep getting bigger?

I also agree with Sean Lange, does the update have to be all or nothing?

I would recommend using a cursor. cursors are a great way to break up large transaction to speed up use and reduce table locks.

DECLARE db_cursor CURSOR FOR SELECT DL.IID,DL.CID,IT.PID FROM DL
                            INNER JOIN IT ON IT.CID = DL.CID AND IT.IID = DL.IID
                            WHERE DL.Suspect = 1; 
DECLARE @first INT;
DECLARE @second INT;
DECLARE @third INT;
OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @first , @second , @third ;
WHILE @@FETCH_STATUS = 0  
BEGIN  
      -- Do your updates one row at a time here
      UPDATE TDS
    SET TDS.DTID = 4 
    FROM TDS
    WHERE TDS.IID = @first AND TDS.PID = @third
    WHERE TDS.DTID <> 4
END;
CLOSE db_cursor;
DEALLOCATE db_cursor;
Community
  • 1
  • 1
bowlturner
  • 1,968
  • 4
  • 23
  • 35