I have 3 versions of a query that ultimately returns the same results.
One of them gets dramatically slower when adding an extra inner join to a relatively small table AND where parameter variables are used within the where clause.
The execution plan is very different for the fast and slow queries (included below each query).
I want to understand why this happened and how to prevent it.
This query takes < 1 second. It does not have an extra inner join but it uses parameter variables in the where clause.
declare @start datetime = '20120115'
declare @end datetime = '20120116'
select distinct sups.campaignid
from tblSupporterMainDetails sups
inner join tblCallLogs calls on sups.supporterid = calls.supporterid
where calls.callEnd between @start and @end
|--Parallelism(Gather Streams)
|--Sort(DISTINCT ORDER BY:([sups].[campaignID] ASC))
|--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([sups].[campaignID]))
|--Hash Match(Partial Aggregate, HASH:([sups].[campaignID]))
|--Hash Match(Inner Join, HASH:([calls].[supporterID])=([sups].[supporterID]))
|--Bitmap(HASH:([calls].[supporterID]), DEFINE:([Bitmap1004]))
| |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([calls].[supporterID]))
| |--Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[IX_tblCallLogs_callend_supporterid] AS [calls]), SEEK:([calls].[callEnd] >= '2012-01-15 00:00:00.000' AND [calls].[callEnd] <= '2012-01-16 00:00:00.000') ORDERED FORWARD)
|--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([sups].[supporterID]))
|--Index Scan(OBJECT:([GOGEN].[dbo].[tblSupporterMainDetails].[AUTOGEN_IX_tblSupporterMainDetails_campaignID] AS [sups]), WHERE:(PROBE([Bitmap1004],[GOGEN].[dbo].[tblSupporterMainDetails].[supporterID] as [sups].[supporterID],N'[IN ROW]')))
This query takes < 1 second. It has an extra inner join BUT uses parameter constants in the where clause.
select distinct camps.campaignid
from tblCampaigns camps
inner join tblSupporterMainDetails sups on camps.campaignid = sups.campaignid
inner join tblCallLogs calls on sups.supporterid = calls.supporterid
where calls.callEnd between '20120115' and '20120116'
|--Parallelism(Gather Streams)
|--Hash Match(Right Semi Join, HASH:([sups].[campaignID])=([camps].[campaignID]))
|--Bitmap(HASH:([sups].[campaignID]), DEFINE:([Bitmap1007]))
| |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([sups].[campaignID]))
| |--Hash Match(Partial Aggregate, HASH:([sups].[campaignID]))
| |--Hash Match(Inner Join, HASH:([calls].[supporterID])=([sups].[supporterID]))
| |--Bitmap(HASH:([calls].[supporterID]), DEFINE:([Bitmap1006]))
| | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([calls].[supporterID]))
| | |--Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[IX_tblCallLogs_callend_supporterid] AS [calls]), SEEK:([calls].[callEnd] >= '2012-01-15 00:00:00.000' AND [calls].[callEnd] <= '2012-01-16 00:00:00.000') ORDERED FORWARD)
| |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([sups].[supporterID]))
| |--Index Scan(OBJECT:([GOGEN].[dbo].[tblSupporterMainDetails].[AUTOGEN_IX_tblSupporterMainDetails_campaignID] AS [sups]), WHERE:(PROBE([Bitmap1006],[GOGEN].[dbo].[tblSupporterMainDetails].[supporterID] as [sups].[supporterID],N'[IN ROW]')))
|--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([camps].[campaignID]))
|--Index Scan(OBJECT:([GOGEN].[dbo].[tblCampaigns].[IX_tblCampaigns_isActive] AS [camps]), WHERE:(PROBE([Bitmap1007],[GOGEN].[dbo].[tblCampaigns].[campaignID] as [camps].[campaignID],N'[IN ROW]')))
This query takes 2 minutes. It has an extra inner join AND it uses parameter variables in the where clause.
declare @start datetime = '20120115'
declare @end datetime = '20120116'
select distinct camps.campaignid
from tblCampaigns camps
inner join tblSupporterMainDetails sups on camps.campaignid = sups.campaignid
inner join tblCallLogs calls on sups.supporterid = calls.supporterid
where calls.callEnd between @start and @end
|--Nested Loops(Inner Join, OUTER REFERENCES:([camps].[campaignID]))
|--Index Scan(OBJECT:([GOGEN].[dbo].[tblCampaigns].[IX_tblCampaigns_isActive] AS [camps]))
|--Top(TOP EXPRESSION:((1)))
|--Nested Loops(Inner Join, OUTER REFERENCES:([calls].[callID], [Expr1007]) OPTIMIZED WITH UNORDERED PREFETCH)
|--Nested Loops(Inner Join, OUTER REFERENCES:([sups].[supporterID], [Expr1006]) WITH UNORDERED PREFETCH)
| |--Index Seek(OBJECT:([GOGEN].[dbo].[tblSupporterMainDetails].[AUTOGEN_IX_tblSupporterMainDetails_campaignID] AS [sups]), SEEK:([sups].[campaignID]=[GOGEN].[dbo].[tblCampaigns].[campaignID] as [camps].[campaignID]) ORDERED FORWARD)
| |--Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[IX_tblCallLogs_supporterID_closingCall] AS [calls]), SEEK:([calls].[supporterID]=[GOGEN].[dbo].[tblSupporterMainDetails].[supporterID] as [sups].[supporterID]) ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[AUTOGEN_PK_tblCallLogs] AS [calls]), SEEK:([calls].[callID]=[GOGEN].[dbo].[tblCallLogs].[callID] as [calls].[callID]), WHERE:([GOGEN].[dbo].[tblCallLogs].[callEnd] as [calls].[callEnd]>=[@s2] AND [GOGEN].[dbo].[tblCallLogs].[callEnd] as [calls].[callEnd]<=[@e2]) LOOKUP ORDERED FORWARD)
Notes:
- I believe that the slowness is being caused by the Clustered Index Seek on
tblCallLogs
, however I don't know why SQL Server would choose this execution plan. - Should I be using a query optimiser hint? I have needed to and am reluctant to tell SQL Server how to do its job...
- The problem seems to be caused by a combination of factors - an extra join AND variables.
- Could the execution plan be trying to reuse a 'bad' plan when it finds variables for the query?
- In real life I will have to use parameter variables. Constants are no good! So this problem could exist in many of my queries/stored procedures!
- I have rebuilt indexes and updated statistics on
tblCampaigns
andtblSupporterMainDetails
. This had no effect. - Both tables have clustered indexes on the primary key (identity integer).
- The foreign key column
campaignid
is indexed. - All queries use the same parameter values - weather it be used as a variable or a constant.
Number of records in tables:
- tblSupporterMainDetails = 12,561,900
- tblCallLogs = 27,242,224
- tblCampaigns = 756
UPDATE:
- I have also rebuilt indexes and updated statistics on
tblcalllogs
. No effect. - I have cleared the execution plan cache using
DBCC FREEPROCCACHE
- tblCallLogs.callEnd is a datetime.
Schemas of involved columns:
tblCampaign.campaignid int not null
tblSupporterMainDetails.campaignid int not null
tblSupporterMainDetails.supporterid int not null
tblCallLogs.supporterid int not null
tblCallLogs.callEnd datetime not null
Indexes:
UPDATE 2:
After adding index to tblCallLogs.supporterId - with include column: callEnd
The 'slow' query speeded up to 40 seconds.
the updated execution plan:
|--Nested Loops(Inner Join, OUTER REFERENCES:([camps].[campaignID]))
|--Index Scan(OBJECT:([GOGEN].[dbo].[tblCampaigns].[IX_tblCampaigns_isActive] AS [camps]))
|--Top(TOP EXPRESSION:((1)))
|--Nested Loops(Inner Join, OUTER REFERENCES:([sups].[supporterID], [Expr1006]) WITH UNORDERED PREFETCH)
|--Index Seek(OBJECT:([GOGEN].[dbo].[tblSupporterMainDetails].[AUTOGEN_IX_tblSupporterMainDetails_campaignID] AS [sups]), SEEK:([sups].[campaignID]=[GOGEN].[dbo].[tblCampaigns].[campaignID] as [camps].[campaignID]) ORDERED FORWARD)
|--Index Seek(OBJECT:([GOGEN].[dbo].[tblCallLogs].[IX_tblCallLogs_supporterid_callend] AS [calls]), SEEK:([calls].[supporterID]=[GOGEN].[dbo].[tblSupporterMainDetails].[supporterID] as [sups].[supporterID]), WHERE:([GOGEN].[dbo].[tblCallLogs].[callEnd] as [calls].[callEnd]>=[@s2] AND [GOGEN].[dbo].[tblCallLogs].[callEnd] as [calls].[callEnd]<=[@e2]) ORDERED FORWARD)
SOLUTION:
The extra join was not actually causing the problem directly, however it obviously changed the statement so that sql server held a different execution plan for it.
By adding
OPTION(RECOMPILE)
to the end of the slow statement, I was able get the expected fast performance. i.e. < 1 second. I am still unsure exactly this solution worked - why didnt flushing all plan work? is this a classic case of parameter sniffing? I will update this post as I learn the exact answer - or until someone can give a clear answer.
Thanks to both @LievenKeersmaekers and @JNK for helping so far...