2

I have a question, related to DB theory:

Let's assume that we have table with 3 columns: [PersonID], [PersonName], [PersonAge].

We know that when we have a nonclustered index by one column, SQL Server orders table data in accordance with specified column and build B+ tree from it. And when we need to find the row using such an index, SQL Server scans the B++ tree by comparing one atomic data object (int or string, for example). That is clear, how non-clustered index works and find data when we build it by one column (suppose [PersonName]), but what if we create non-clustered index by 2 columns: [PersonName] and [PersonAge]?

I understand, that during sorting criterion with main importance will be [PersonName] and if several records have it same, than they would be sorted by [PersonAge]. But how physically SQL Server would process the B++ tree, based on this index?

How it will use such tree when it should execute query like

SELECT * 
FROM dbo.Person 
WHERE [PersonName] = 'Bob' AND [PersonAge] = 45

Thanks for explaining.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Alex44
  • 139
  • 2
  • 6

1 Answers1

2

The nonclustered index will be pretty much the same as with just one column - but each index entry in the B++ navigation tree will have two column values (PersonName, PersonAge).

Since both columns are comparable, a clearly defined ordering will be applied - first by PersonName, then by PersonAge - so your navigation tree items would look something like this (if looked at as a flat list):

Alice,42
Alice,57
Andrew,31
Anthony,23
...
...
Bertrand,48
Bob,34
Bob,39
Bob,44 
Bob,45
Bob,58
......
Zachary,19

When you run your query

SELECT * 
FROM dbo.Person 
WHERE [PersonName] = 'Bob' AND [PersonAge] = 45

then SQL Server will navigate the B++ navigation tree - first looking at the PersonName, until it finds all occurrences of Bob, and then it scans all Bob's to find the one you're looking for (or maybe it doesn't find it).

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • Thank you an advance. So, if I understand correct, SQL Server finds ALL the records where [PersonName] is 'Bob'(seek operation) and then it will scan each of those record's and comparing with 45 (Age value)? If it is so, what sense of ordering by [PersonAge] parameter? Which profit it gives for us? – Alex44 Dec 09 '14 at 20:07
  • @Alex44: if you *don't* have the `PersonAge` in the index - then how could SQL Server find the *correct* Bob? It couldn't - it would have to take **all** Bobs it finds, go to the actual data page (in the clustered index leaf level) and inspect the `PersonAge` attribute there. That would results in several very expensive **Key Lookup** operations and thus would significantly hurt performance – marc_s Dec 09 '14 at 21:06