2

Im a begginer. I know indexes are necessary for performance boosts, but i want to know how they actually work behind the scenes. Beforehand, I used to think that we should make indexes on those columns which are included in where clause (which I realized is wrong)

For example, SELECT * from MARKS where marks_obtained > 50

Consider that there's a clustered index on primary key of this table and I created a non-clustered index on marks_obtained column as its there in my where clause.

My perception: So the leaf nodes will be containing pointers to clustered index and as clustered index points to actual rows, it will select entire rows (due to asteric in my query)

Scenario
I came across following query (from AdventureWorks DB on which a non-clustered index was created) which works fine and took less than a second to execute 3200000 rows until a new column was inserted into it:

Query

SELECT x.* 
INTO#X
FROM dbo.bigProduct AS p
CROSS APPLY
(
SELECT TOP 1000 *
FROM dbo.bigTransactionHistory AS bth
WHERE 
     bth.ProductId = p.bth.ProductId 
ORDER BY
     TransactionDate DESC
) AS x
WHERE 
    p.ProductId BETWEEN 1000 AND 7500
GO

NEW INSERTED COLUMN

ALTER TABLE dbo.bigTransactionHistory 
ADD CustomerId INT NULL

After insertion of above column it took 17 seconds! means 17 times slower. A non-clusered index was now missing CustomerId column in the index. Just after including CustomerId, problem was gone.

Question CustomerId seemed to be the culprit until it was added to the index. BUT HOW???

Sadiq
  • 786
  • 1
  • 10
  • 35
  • You are inserting into temp table so actually **write** operations could take additional time. I want to say that all the operations are not performed "in the air" and each one does have consequences. Prior operation could take that few time because of recent relevant cache contents (how long would it perform after DBCC DROPCLEANBUFFERS?) Prior operation caused writes of 3.2M rows into tempdb. There is a chance that 17s took file enlargement of tempdb. How many page splits happened after column insert? Did you recreate (defragment) clustered index after that? – Ivan Starostin Mar 19 '16 at 15:49
  • It's not only about _"how does index work on select"_. Every action you performed had an effect. Select had effect, as well as insert, column create, and another select-into. And I don't quite understand why you started talking about `MARKS` table, but ran tests on two other tables. – Ivan Starostin Mar 19 '16 at 15:51
  • DBCCDROPCLEANBUFFERS was performed prior to the execution of above Query. Yes, Clusered index was recreated(with CustomerId column including) after new inserted column – Sadiq Mar 19 '16 at 16:39
  • Post both plans as usr mentioned. And may be IO statistics. – Ivan Starostin Mar 19 '16 at 16:40

1 Answers1

0

The execution plan would answer this but I'll make a guess: The non-clustered index was no longer enough to satisfy the query after the additional column had been added. This can cause the index to not be used anymore. It also can cause one clustered index seek per row.

Learn to read execution plans. Turn on the "actual execution plan" feature routinely for each query that you test.

usr
  • 168,620
  • 35
  • 240
  • 369