Update Query with NVARCHAR(max) in SQL Server

Question

Have an issue with SQL Server performance and wanted to see if anyone can give some tips about improving the performance of an update query.

What I'm doing is updating one table with data from another table. Here's some of the basics:

SQL Server 2008 R2
Data is pumped to WO table originally from other system (pumped in using datareader and sqlbulkcopy in ADO.NET)
Additional data is pumped to TEMP_REMARKS (pumped in using datareader and sqlbulkcopy in ADO.NET)
Unfortunately, combining the WO and REMARKS in the originating system (via the reader query) is not possible (mainly performance reasons)
Update to WO occurs using value from TEMP_REMARKS where two columns are updated
Note that the column being transferred from TEMP_REMARKS to REMARKS is a nvarchar(max) and is being placed into another nvarchar(max) column (actually two - see query)
WO has 4m+ records
TEMP_REMARKS has 7m+ records

For the join between the two, the following is what is being used:

/* === UPDATE THE DESCRIPTION  */     
UPDATE WO
SET WO_DESCRIPTION = TEMP_REMARKS.REMARKS
FROM WO 
INNER JOIN TEMP_REMARKS ON WO.WO_DESCRIPTION_ID = TEMP_REMARKS.REMARKS_ID;

/* === UPDATE THE FINDINGS  */
UPDATE WO
SET FINDINGS = TEMP_REMARKS.REMARKS
FROM WO 
INNER JOIN TEMP_REMARKS ON WO.FINDINGS_ID = TEMP_REMARKS.REMARKS_ID;

The problem at this point is that the update to the WO table is taking over two hours to complete. I've tried using the MERGE statement with no success. I've got other more completed procedures in the db that don't take nearly as long, so I'm convinced that it is not the configuration of the SQL Server itself.

Is there something that should be done when updating nvarchar(max) columns?

What can be done to improve the performance of this query?

Here are the table definitions:

CREATE TABLE [dbo].[WO](
    [DOCUMENT_ID] [decimal](18, 0) NOT NULL,
    [WO_DESCRIPTION_ID] [decimal](18, 0) NULL,
    [WO_DESCRIPTION] [nvarchar](max) NULL,
    [FINDINGS_ID] [decimal](18, 0) NULL,
    [FINDINGS] [nvarchar](max) NULL,
.... bunch of other fields
 CONSTRAINT [PK_WO] PRIMARY KEY CLUSTERED 
(
    [DOCUMENT_ID] ASC
)

This is the table definition for the TEMP_REMARKS:

CREATE TABLE [dbo].[TEMP_REMARKS](
    [REMARKS_ID] [decimal](18, 0) NOT NULL,
    [REMARKS] [nvarchar](max) NULL
) ON [PRIMARY]

Why is combining WO and REMARKS in the originating systems not possible? Clearly this route temp table route has performance issues. — paparazzo, Oct 17 '12 at 13:28
The same is data going into two different columns. TEMP_REMARKS.REMARKS is going into both WO_DESCRIPTION and FINDINGS. Both based on TEMP_REMARKS.REMARKS_ID. Why is a table with 7M+ records updating a table with 4M+ records? Are the two IDs staggered to not have a collision? — paparazzo, Oct 17 '12 at 14:01
Blam, The originating system REMARKS table has over 100m+ records and when we did join them in the original query and pushed the results through a datareader to the WO table, it took twice as long as the current poorly performing process. I agree that would be ideal, but unfortunately it doesn't. Also, we don't have any control over the originating system to put indexes, change schema, etc... We are dealing with what we are dealt on that system. — mdjtlj, Oct 17 '12 at 14:11
Blam, the FINDINGS_ID and the DESCRIPTION_ID both refer to the REMARKS_ID in the REMARKS table. What we do is pull from the originating system all the REMARKS_ID which are required to fill in these two fields on our side for the WO table. Some records don't have FINDINGS_ID, so that is why the value is not 2 x WO Count. There is no risk of collision as the originating system does not ever use the same REMARKS_ID twice. Why they don't store these remarks and findnigs on the original table are beyond me. — mdjtlj, Oct 17 '12 at 14:14
Blam, What is crazy is that any text that they use in this originating system all goes to the REMARKS table and the various tables link to this. Ideally, the text should have been placed in the original table, especially since they create a new remarks_ID for any and all text, their code doesn't ever link it to multiple records. — mdjtlj, Oct 17 '12 at 14:16
Does not make sense this is a join problem. If the join is able to select records for TEMP why does that same join not preform for going directly to WO? — paparazzo, Oct 17 '12 at 14:21
the join works in the originating system, but when the datareader feeds it back, the total time taken is twice what it is now. It doesn't make sense to me either, but that what the results show and I've got to reduce the overall time. — mdjtlj, Oct 17 '12 at 15:15

score 2 · Answer 1 · answered Oct 17 '12 at 12:54

2

I think, first of all you should consider to create primary key on TEMP_REMARKS, or at least some index on REMARKS_ID

answered Oct 17 '12 at 12:54

Roman Pekar

107,110
28
195
197

Update Query with NVARCHAR(max) in SQL Server

1 Answers1