4

When I try and insert some results from an sp_executesql into a variable table, I get really bad perf.

First, the query just as a simple Select.

EXEC sp_executesql N'SELECT      [a].[ListingId]
FROM        [dbo].[Listings] [a] LEFT OUTER JOIN 
            [dbo].[AgencyCompany] [b] ON [a].[AgencyCompanyId] = [b].[AgencyCompanyId]
ORDER BY    UpdatedOn DESC'

This runs in a few seconds as there's a few million results to pass over the wire from a DB in the cloud to my localhost machine. So that's totally kewl.

Query plan:

enter image description here

Now, lets change the query to INSERT INTO the results...

DECLARE @ListingIds TABLE (ListingId INTEGER PRIMARY KEY)
INSERT INTO @ListingIds
EXEC sp_executesql N'SELECT      [a].[ListingId]
FROM        [dbo].[Listings] [a] LEFT OUTER JOIN 
            [dbo].[AgencyCompany] [b] ON [a].[AgencyCompanyId] = [b].[AgencyCompanyId]
ORDER BY    UpdatedOn DESC'

This takes about 45 seconds to return the results to my localhost machine. Same Query (well, same SELECT query).

Lets look at the query plan...

enter image description here

Now lets try this with raw sql...

DECLARE @ListingIds TABLE (ListingId INTEGER PRIMARY KEY)
INSERT INTO @ListingIds
SELECT      [a].[ListingId]
FROM        [dbo].[Listings] [a] LEFT OUTER JOIN 
            [dbo].[AgencyCompany] [b] ON [a].[AgencyCompanyId] = [b].[AgencyCompanyId]
ORDER BY    UpdatedOn DESC

4 seconds to run and the plan..

enter image description here

  • So when I run the query normally, it's ok and fast.
  • When I INSERT INTO it takes 45 odd seconds.
  • Same times when I have params or not.
  • Same times if I use OPTION (RECOMPILE) or not.
  • Why use sp_executesql instead of just the raw sql statement? Because we have a heap of dynamic WHERE / AND statements which starts to make things hard/not-that-perf-nice.

Tech..
- Sql Server 2012

enter image description here

Pure.Krome
  • 84,693
  • 113
  • 396
  • 647

3 Answers3

6

The difference between both statements is that in the raw version:

DECLARE @ListingIds TABLE (ListingId INTEGER PRIMARY KEY)
INSERT INTO @ListingIds
SELECT      [a].[ListingId]
FROM        [dbo].[Listings] [a] LEFT OUTER JOIN 
            [dbo].[AgencyCompany] [b] ON [a].[AgencyCompanyId] = [b].[AgencyCompanyId]
ORDER BY    UpdatedOn DESC

... the results of the select are directly streamed into the insert statement.

But in the dynamic version:

DECLARE @ListingIds TABLE (ListingId INTEGER PRIMARY KEY)
INSERT INTO @ListingIds
EXEC sp_executesql N'SELECT      [a].[ListingId]
FROM        [dbo].[Listings] [a] LEFT OUTER JOIN 
            [dbo].[AgencyCompany] [b] ON [a].[AgencyCompanyId] = [b].[AgencyCompanyId]
ORDER BY    UpdatedOn DESC'

... the results of the EXEC sp_executesql is cumulated into a sort of a temporary table called the Parameter Table (you can see the extra step in your execution plan). Only after this temporary table has been populated with the millions of rows does the insert statement actually start reading the data. This is much slower.

You may be able to get around this performance degradation if you can somehow refactor the code to push the INSERT statement inside the EXEC sp_executesql call. By doing that, the results of the SELECT statement could once again be streamed directly to the INSERT statement.

References: Here is an interesting article that discusses the problem you are facing: The Hidden Costs of INSERT EXEC.

A little aside: you don't need the ORDER BY clause if all you are doing is inserting the data right after.

Community
  • 1
  • 1
sstan
  • 35,425
  • 6
  • 48
  • 66
  • 1
    I ended up using a `#TempTable` and inserting into that inside the sql. The reason i used a `#TempTable` was because i was doing _another_ query (with these temp results) after the `EXEC sp_executesql '..'` – Pure.Krome Aug 10 '16 at 04:30
  • Out of curiosity, what was the performance after you made that change? Was it around the expected 4 seconds? – sstan Aug 10 '16 at 10:32
  • yep. about 6 seconds but i included `FETCH/OFFSET` on it. But I still removed the requirement of the `PARAMETER TABLE`. – Pure.Krome Aug 10 '16 at 12:27
0

This is a bit long for a comment.

You haven't shown the entire execution plan, but the "index scan" suggests what is happening. An index is being used to retrieve the data in order. That means that there is no additional sorting step and the query can start returning data immediately. You are seeing the results as they return.

Evidence? If there are millions of rows, there should be some lag in getting returned results. If the data has to be sorted, the rows have to be read and then ordered.

On the other hand, the insert into has to wait until the all the data has been generated, because it has to insert all the results into a table. It doesn't return until the last row has been read.

EDIT:

I notice another problem. The primary index wants the data sorted in order by the primary key. You are sorting it by another value. No doubt the extra sorts of the data are taking additional time. However, a factor of 10 seems a bit extreme.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Updated post. When i use `INSERT INTO` with the raw sql (like a normal sql query) .. the insert takes 4 seconds. Also .. if I'm missing some data .. I can post it in here to help. *I'm so confused* :( – Pure.Krome Aug 10 '16 at 02:23
0

Regretfully I don't have a machine to elaborate on at this time but I believe the difference is declaring the temporary table with a primary key. In the second query execution plan you can see that the clustered index insert is taking the majority of time.

Using a temporary table without a primary key should achieve an equivalent performance.

Dustin Venegas
  • 760
  • 1
  • 7
  • 16
  • Removed the `PRIMARY KEY` from the variable table => dropped the perf from 45 seconds to 39. I'm expecting similar perf (ie. 4 odd seconds) – Pure.Krome Aug 10 '16 at 02:27