I've been struggling with a query performance issue for a few weeks now. At this point I've squeezed absolutely everything out of the query in terms of JOIN types, Indexing, Keeping Statistics up to date, etc... etc... but then I stumbled on something by accident.
A little background.
The table in question represents a Record
Id INT PK
Name NVARCHAR(50)
Status INT FK
Created DATETIME
Version NVARCHAR(10)
Data XML
After some performance benchmarking, I realised that the inclusion of the final column in the select far outweighs things like indexing, join complexity & network considerations by somewhere between a factor 10x & 20x.
The following comparisons were done between SSMS on local dev machine connecting to SQL Azure.
SELECT Id FROM Records -- ~10 secs for 300,000 rows
SELECT Id, Name, Status, Created, Version FROM Records -- ~20 sec for 300,000 rows
SELECT * FROM Records -- ~350 sec for 300,000 rows
To be clear, I'm not doing anything crazy with the xml column (XML DML or XPath queries). Just simply including/excluding it from the select.
At this point, I think I've solved my problem by creating a RecordLight
Entity, NHibernate Map & MVC Controller stack, purely for the purposes of searching & listing in our App.
But I'd like to understand why the inclusion of the XML column is having such a negative effect on Query Performance