I have 2 tables: tbl_Token (~100 million rows) & tbl_EntryTokenSummary (~3.75 billion rows)
I'm trying to understand why the following 2 scenarios result in different 'net' plans. I don't really understand why Scenario 1 even references the non-clustered index, it shouldn't be relevant.
I think it's save to say the performance is a result of the 1418 scans, but why are they done?
/********************************************************************
* Scenario 1
********************************************************************/
Select top 100
*
From (
Select
EntryId, FirstOccurence
From tbl_EntryTokenSummary
Where TokenId = (
Select Id
From tbl_Token
Where TSig = dbo.GetTokenSignature('water') and Text = 'water')
and EntryTypeId in (1,2)
) a
Inner Join (
Select
EntryId, FirstOccurence
From tbl_EntryTokenSummary
Where TokenId = (
Select Id
From tbl_Token
Where TSig = dbo.GetTokenSignature('pipe') and Text = 'pipe')
and EntryTypeId in (1,2)
) b on a.EntryId = b.EntryId
This results in the following messages output:
2018-04-18 22:15:40.5925859
(100 row(s) affected)
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tbl_EntryTokenSummary'. Scan count 1418, logical reads 20419, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tbl_Token'. Scan count 11, logical reads 21, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
2018-04-18 22:15:43.9677926
And this plan: QueryPlan Scenario 1
However if I break out the TokenId lookups, the results change significantly:
/********************************************************************
* Scenario 2
********************************************************************/
Declare @t1 int = (Select Id From tbl_Token Where TSig = dbo.GetTokenSignature('water') and Text = 'water')
print sysdatetime()
Declare @t2 int = (Select Id From tbl_Token Where TSig = dbo.GetTokenSignature('pipe') and Text = 'pipe')
print sysdatetime()
Select top 100
ets1.EntryId, ets1.FirstOccurence, ets2.EntryId, ets2.FirstOccurence
From tbl_EntryTokenSummary ets1
Inner Join tbl_EntryTokenSummary ets2 On ets2.TokenId = @t2 and ets1.EntryId = ets2.EntryId
Where ets1.TokenId = @t1
resulting in much better output:
2018-04-18 22:15:43.9677926
Table 'tbl_Token'. Scan count 6, logical reads 12, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
2018-04-18 22:15:43.9677926
Table 'tbl_Token'. Scan count 5, logical reads 9, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
2018-04-18 22:15:43.9677926
(100 row(s) affected)
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tbl_EntryTokenSummary'. Scan count 2, logical reads 1019, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
2018-04-18 22:15:44.1084285
EDIT:
While reviewing the table set up, I realized I had not set the primary key, only a partitioned clustered index (which was actually unique, but not marked as unique).
Just adding the Primary Key actually made things worse, but once I dropped all indexes and added back the Primary Key (as partitioned/clustered), the query optimizer calculated a more correct plan did the exact same thing.
Original query plan: https://www.brentozar.com/pastetheplan/?id=Hku0KKH3f
Updated query plan (after Primary Key): https://www.brentozar.com/pastetheplan/?id=HJu3kwo2f
Side Note: I was directed to this link as an explanation of that plan that was generated: https://sqlperformance.com/2014/01/sql-plan/starjoininfo-in-execution-plans