Performance issue in retriving data from XMLAttribute using Xquery

Question

I have one history table which has information of transaction record of master table, in this table I have used XML column to store that transaction information. Table structure with data looks as follows,

Table information

In Content XML data are stored as XML like below.

        <Answers>
      <AnswerSet>
        <Answer questionId="ProductCode">S3404</Answer>
        <Answer questionId="ProductName">Parabolic Triple</Answer>
        <Answer questionId="LegacyOptionID" selectedvalue="1389">1389</Answer>
        <Answer questionId="LegacyContentID" selectedvalue="624">624</Answer>
        <Answer questionId="LegacyPageID" selectedvalue="355">355</Answer>
        <Answer questionId="LegacyParentID" selectedvalue="760">760</Answer>
     </AnswerSet>
    </Answers>

In all rows structure is same but data is different in answer node, I want to get data which has ProductCode="S3404" and CreatedDate is New.

I have created query like

select n2.* from nodehistory n2 CROSS APPLY 
n2.content.nodes('Answers/AnswerSet') T(c) WHERE 
c.value('./Answer[@questionId="ProductCode"][1]','varchar(100)') ='J154'

ProductCode has unique data for every nodeid, but this is returning more than one row for same nodeid because this is transaction table so same XML can be store multiple time, for this require condition like order by Createddate desc, but execution of this query is taking more time due to XML processing I think.

Can we do like first get

Select Top 1 nodeid from NodeHistory order by CreatedDate desc

then search for XML part.

Any ideas on the more suitable views for better performance?

score 0 · Answer 1 · answered Sep 10 '12 at 13:59

If you're not doing much else with the XML data then .exist should be more efficient than .value. I think there's a note in BOL regarding this. You could also use sql:variable to make this more generic, eg something like:

declare @produceCode varchar(20) = 'S3404'

select n2.* 
from nodehistory n2 
    inner join ( select max(id) id from #nodehistory group by nodeId ) maxId ON n2.id = maxId.id
where n2.content.exist('Answers/AnswerSet/Answer[@questionId="ProductCode"][.=sql:variable("@produceCode")]') = 1

I've used a subquery to limit the resultset to the max(id) per nodeId. Your requirement might be slightly different but you get the idea.

In terms of performance, XML indexes can transform SQL / XML queries but at a cost. For storage you will need between 2-5 times the size of the original table so you'll have to weigh it up with your data. If you do decide to go with XML indexes, then a PROPERTY index should help this type of query, eg

-- create the primary XML index
CREATE PRIMARY XML INDEX xmlidx_nodehistory ON nodehistory(content)
GO

CREATE XML INDEX xmlprpidx_nodehistory ON nodehistory(content)
USING XML INDEX xmlidx_nodehistory FOR PROPERTY
go

declare @produceCode varchar(20) = 'S3404'

select n2.* 
from nodehistory n2 
    inner join ( select max(id) id from nodehistory group by nodeId ) maxId ON n2.id = maxId.id
where n2.content.exist('Answers/AnswerSet/Answer[@questionId="ProductCode"][.=sql:variable("@produceCode")]') = 1

See these great articles for more SQL XML performance tuning ideas:

Performance Optimizations for the XML Data Type in SQL Server 2005

http://msdn.microsoft.com/en-us/library/ms345118.aspx

XML Indexes in SQL Server 2005

http://msdn.microsoft.com/en-us/library/ms345121(SQL.90).aspx

Performance issue in retriving data from XMLAttribute using Xquery

1 Answers1