0

I have a table that has a primary key/clustered index on an ID column and a nonclustered index on a system date column. If I query all the columns from the table using the system date column (covering index wouldn't make sense here) the execution plan shows a key lookup because for each record it finds it has to go the the ID to get all of the column data.

The weird thing is, if I write 2 queries with a temp table it performs much faster. I can query the system date to get a table of ID's and then use that table to search the ID column. This makes sense because you're no longer doing the slow key lookup for each record.

Why doesn't the optimizer do this for us?

--slow version with key lookup
--id primary key/clustered index
--systemdate nonclustered index  

select ID, col1, col2, col3, col4, col5, SystemDate
from MyTable
where SystemDate > '2019-01-01'

--faster version 
--id primary key/clustered index
--systemdate nonclustered index  

select ID, SystemDate
into #myTempTable
from MyTable
where SystemDate > '2019-01-01'

select t1.ID, t1.col1, t1.col2, t1.col3, t1.col4, t1.col5, t1.SystemDate
from MyTable t1
inner join #myTempTable t2
on t1.ID = t2.ID
schulzey
  • 95
  • 1
  • 2
  • 9

2 Answers2

1

Well, in second case you're actually doing a key lookup yourself, aren't you? ; )

Optimizer could perform slower due to outdated (or missing) statistics, fragmented index.

To tell you why it's actually slower, it's best if you'd paste your execution plans here. This would be way easier to explain what happens.

Evaldas Buinauskas
  • 13,739
  • 11
  • 55
  • 107
  • The query plan for the lookup looks like the one posted below in Lukasz's reply. I think this is the problem. A key lookup will do a nested loop join whereas if you separate the queries out like I did above the optimizer performs either a merge/hash join. I think that's the reason why the key lookup performs so much slower, but I could be wrong. – schulzey Aug 26 '19 at 01:09
  • Ye, but could you paste plan where you use temporary table? You could then see whether it does a nested loop or a merge/hash join? But what you're suggesting sounds about right for me. – Evaldas Buinauskas Aug 26 '19 at 05:52
0

Query optimizer chooses key lookup because the query is not supported by covering index. It has to grab missing columns from table itself:

/*
--slow version with key lookup
--id primary key/clustered index
--systemdate nonclustered index 
*/
select ID, col1, col2, col3, col4, col5, SystemDate
from MyTable
where SystemDate > '2019-01-01';

Adding a covering index should boost the performance:

CREATE INDEX my_idx ON MyTable(SystemDate) INCLUDE(col1, col2, col3, col4, col5);

db<>fiddle demo


For query without JOIN:

select ID, col1, col2, col3, col4, col5, SystemDate
from MyTable  -- single table
where SystemDate > '2019-01-01';

There is JOIN in execution plan:

enter image description here

After introducing covering index there is no need for additional key lookup:

enter image description here

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
  • I understand this. However, if you work in a large enough organization where all kinds of queries are hitting the same tables, putting a covering index on an entire table each time someone grabs something more than just what's in the index isn't viable. What I'm wondering is why writing as two separate queries is faster than using the one query that performs a key lookup. – schulzey Aug 26 '19 at 01:06