6

Can someone please help me tune this SQL query?

SELECT  a.BuildingID, a.ApplicantID, a.ACH, a.Address, a.Age, a.AgentID, a.AmenityFee, a.ApartmentID, a.Applied, a.AptStatus, a.BikeLocation, a.BikeRent, a.Children, 
        a.CurrentResidence, a.Email, a.Employer, a.FamilyStatus, a.HCMembers, a.HCPayment, a.Income, a.Industry, a.Name, a.OccupancyTimeframe, a.OnSiteID,
        a.Other, a.ParkingFee, a.Pets, a.PetFee, a.Phone, a.Source, a.StorageLocation, a.StorageRent, a.TenantSigned, a.WasherDryer, a.WasherRent, a.WorkLocation, 
        a.WorkPhone, a.CreationDate, a.CreatedBy, a.LastUpdated, a.UpdatedBy
FROM    dbo.NPapplicants AS a INNER JOIN
        dbo.NPapartments AS apt ON a.BuildingID = apt.BuildingID AND a.ApartmentID = apt.ApartmentID
WHERE   (apt.Offline = 0)
AND     (apt.MA = 'M')

.

Here's what the Execution Plan looks like:

.

enter image description here

What I don't understand is why I'm getting a Index Scan for NPapplicants. I have an Index that covers BuildingID and ApartmentID. Shouldn't that be used?

Axeva
  • 4,697
  • 6
  • 40
  • 64

2 Answers2

9

It is because it is expecting close to 10K records to return from the matches. To go back to the data to retrieve other columns using 10K keys is equivalent to something like the performance of just scanning 100K records (at the very least) and filtering using hash match.

As for access to the other table, the Query Optimizer has decided that your index is useful (probably against Offline or MA) so it is seeking on that index to get the join keys.

These two are then HASH matched for intersections to produce the final output.

RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262
  • Thanks for the response. So are you saying there's not much I can do. The query is about as fine-tuned as it's going to get with that much data to return? – Axeva Jan 28 '11 at 22:40
  • For that amount of columns from the applicant table - yes, that's as good as it gets. Unless the filter on `apartments(Offline,MA)` is really really selective (like 1% or less) and `count(apartments) < count(applicant)` such that it can produce only a few keys on which to INDEX SEEK on applicant. – RichardTheKiwi Jan 28 '11 at 22:42
4

A seek in a B-Tree index is several times as expensive as a table scan (per record).

Additionally, another seek in the clustered index should be made to retrieve the values of other columns.

If a large portion of records is expected to match, then it is cheaper to scan the clustered index.

To make sure that the optimizer had chosen the best method, you may run this:

SET STATISTICS IO ON
SET STATSTICS TIME ON

SELECT  a.BuildingID, a.ApplicantID, a.ACH, a.Address, a.Age, a.AgentID, a.AmenityFee, a.ApartmentID, a.Applied, a.AptStatus, a.BikeLocation, a.BikeRent, a.Children, 
        a.CurrentResidence, a.Email, a.Employer, a.FamilyStatus, a.HCMembers, a.HCPayment, a.Income, a.Industry, a.Name, a.OccupancyTimeframe, a.OnSiteID,
        a.Other, a.ParkingFee, a.Pets, a.PetFee, a.Phone, a.Source, a.StorageLocation, a.StorageRent, a.TenantSigned, a.WasherDryer, a.WasherRent, a.WorkLocation, 
        a.WorkPhone, a.CreationDate, a.CreatedBy, a.LastUpdated, a.UpdatedBy
FROM    dbo.NPapplicants AS a INNER JOIN
        dbo.NPapartments AS apt ON a.BuildingID = apt.BuildingID AND a.ApartmentID = apt.ApartmentID
WHERE   (apt.Offline = 0)
AND     (apt.MA = 'M')

SELECT  a.BuildingID, a.ApplicantID, a.ACH, a.Address, a.Age, a.AgentID, a.AmenityFee, a.ApartmentID, a.Applied, a.AptStatus, a.BikeLocation, a.BikeRent, a.Children, 
        a.CurrentResidence, a.Email, a.Employer, a.FamilyStatus, a.HCMembers, a.HCPayment, a.Income, a.Industry, a.Name, a.OccupancyTimeframe, a.OnSiteID,
        a.Other, a.ParkingFee, a.Pets, a.PetFee, a.Phone, a.Source, a.StorageLocation, a.StorageRent, a.TenantSigned, a.WasherDryer, a.WasherRent, a.WorkLocation, 
        a.WorkPhone, a.CreationDate, a.CreatedBy, a.LastUpdated, a.UpdatedBy
FROM    dbo.NPapplicants WITH (INDEX (index_name)) AS a
INNER JOIN
        dbo.NPapartments AS apt ON a.BuildingID = apt.BuildingID AND a.ApartmentID = apt.ApartmentID
WHERE   (apt.Offline = 0)
AND     (apt.MA = 'M')

Replace index_name with the actual name of your index and compare the execution times and the numbers of I/O operations (as seen in the messages tab)

Quassnoi
  • 413,100
  • 91
  • 616
  • 614
  • 1
    Interesting. Without the Index: CPU time = 93 ms, elapsed time = 679 ms. /// With the Index: CPU time = 172 ms, elapsed time = 666 ms. – Axeva Jan 28 '11 at 23:07
  • @Axeva: as you can see, the mama optimizer didn't raise no fool. On non-selective fields, an index seek is more expensive, especially with a non-covering index. – Quassnoi Jan 29 '11 at 01:14