0

The following query causes the SQL Server 2014 TEMPDB.MDF to grow to 40Gb and takes about an hour to run.

Similar queries for columns that are not NVARCHAR(MAX) take a few minutes.

The table being updated has approximately 7 million rows.

Adding the AND C.SQLStmt IS NOT NULL to the end improved things to get to this point.

Any help would be appreciated.

UPDATE R 
SET    R.SQLStmt = C.SQLStmt 
FROM SampleResults R 
  JOIN SampleTests T ON T.SampleCode = R.SampleCode 
                    AND T.TestPosition = R.TestPosition 
  JOIN TestComponents C ON T.TestCode = C.TestCode 
                       AND T.TestVersion = C.AuditNumber 
                       AND R.ComponentColumn = C.ComponentColumn 
                       AND R.ComponentRow = C.ComponentRow 
WHERE T.AuditFlag = 0 
AND   R.AuditFlag = 0 
AND   C.SQLStmt IS NOT NULL

This is following on from this question, which I don't have enough reputation to comment on.

The execution plan is:

Update
Cost: 0%
 |
Cluseterd Index Updated
[SampleResults].[pk_SampleResults]
Cost: 27%
 |
Top
COST: 0%
 |
Sort
(Distinct Sort)
Cost: 31%
 |
Nested Loops   Index Seek (NonClustered)
(Inner Join) - [SampleTests].[SampleTestsAuditFlag...
Cost: 0%       Cost: 10%
 |
Nested Loops   Key Lookup (Clustered)
(Inner Join) - [SampleResults].[pk_SampleResults]...
Cost: 0%       Cost: 15%
 |
Nested Loops   Index Seek (NonClustered)
(Inner Join) - [SampleResults].[SampleResultsCompo...
Cost: 0%       Cost: 9%
 |
Filter
Cost: 0%
 |
Clustered Index Scan (Clustered)
[TestComponents].[pk_TestComponents...
Cost: 9%
Jon Mason
  • 3
  • 2
  • `(n)varchar(MAX)` columns don't like being joined on. Personally, I would reconsider your column's scale; if you need to be performing a `JOIN` on it, do you really need to be storing up to 2GB of data in that column and make it "unindexable"? *Edit: Helps if I include the word `MAX`. /facepalm* – Thom A Mar 23 '18 at 17:09
  • Can you include the execution plan? – Xedni Mar 23 '18 at 17:09
  • 1
    Also, if you're updating that many rows, you might consider doing the updates in batches. It won't cause such an overflow to TempDB then, as the volume of data being processed at one time will be smaller. – Thom A Mar 23 '18 at 17:14
  • I'd also batch this instead. You're probably blowing up your transaction log – Xedni Mar 23 '18 at 17:14
  • You could have a MAX(LEN(column )) then create a temp table with that size of NVARCHAR, this would allow for the quick joins and also additionally add some custom indexes. – Holmes IV Mar 23 '18 at 21:46
  • Added execution plan to question Xedni. – Jon Mason Mar 26 '18 at 10:31
  • I changed both source and destination columns from NVARCHAR(MAX) to NVARCHAR(4000) and TEMPDB.MDF is only growing to 22Gb now. In practice I don't think any of our customers would have used more than 4000 characters, but there's always the chance. – Jon Mason Mar 26 '18 at 10:58

1 Answers1

0

Because the maximum size of the actual data stored in the column was only 392 characters, the final solution was:

1) Temporarily alter the size of both the source and destination columns (SampleResults.SQLStmt and TestComponents.SQLStmt) to 392 characters. 2) Do the update. 3) Alter the size of both the source and destination columns to their new more sensible size of 4000 characters.

DECLARE @tempSQLStmtLength SMALLINT = (SELECT MAX(LEN(SQlStmt)) FROM TestComponents WHERE SQlStmt IS NOT NULL)
IF @tempSQLStmtLength IS NOT NULL AND @tempSQLStmtLength > 0
BEGIN
  -- Temporarily reduce size of SQLStmt columns so that update is as quick as possible.
  EXEC('ALTER TABLE SampleResults ALTER COLUMN SQLStmt NVARCHAR(' + @tempSQLStmtLength +') NULL')
  EXEC('ALTER TABLE TestComponents ALTER COLUMN SQLStmt NVARCHAR(' + @tempSQLStmtLength + ') NULL')

  -- Perform update.
  UPDATE R SET R.SQLStmt = C.SQLStmt FROM SampleResults R JOIN SampleTests T ON T.SampleCode = R.SampleCode AND T.TestPosition = R.TestPosition JOIN TestComponents C ON (T.TestCode = C.TestCode OR T.TestCode + '-' + CAST(T.TestPosition AS VARCHAR(5)) + '-' + T.SampleCode = C.TestCode) AND T.TestVersion = C.AuditNumber AND R.ComponentColumn = C.ComponentColumn AND R.ComponentRow = C.ComponentRow WHERE T.AuditFlag = 0 AND R.AuditFlag = 0 AND C.SQLStmt IS NOT NULL

  -- Now that update is finished, set SQLStmt columns to their final new size.
  ALTER TABLE SampleResults ALTER COLUMN SQLStmt NVARCHAR(4000) NULL
  ALTER TABLE TestComponents ALTER COLUMN SQLStmt NVARCHAR(4000) NULL
END;

TEMPDB.MDB only grew to 2.8Gb and took a few minutes.

Jon Mason
  • 3
  • 2