I have a table Credit:
create table Credit (ID_Credit int identity, ID_PayRequestStatus int, ... 20 more fields)
create nonclustered index Credit_ix_PayRequestStatus ON dbo.Credit(ID_PayRequestStatus)
The table has about 200k rows. The distribution of ID_PayRequestStatus is as follows:
| ID_PayRequestStatus | Number of rows |
|---|---|
| 400 | 198000 |
| 300 | 1000 |
| 200 | 490 |
| 100 | 450 |
| 999 | 250 |
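(These counts come from a plain GROUP BY, something along these lines:)
select ID_PayRequestStatus, count(*) as NumberOfRows
from dbo.Credit
group by ID_PayRequestStatus
order by NumberOfRows desc;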
If I run a query like this:
declare @ID_Status int = 200
select * from Credit where ID_PayRequestStatus = @ID_Status
It uses a clustered index scan (on the primary key), and SSMS suggests that I create another index on the same column (ID_PayRequestStatus) but with every remaining column in the INCLUDE clause, i.e. effectively covering the entire table. This sounds like classic parameter sniffing.
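For reference, this is how I look at the statistics the optimizer has for that column (assuming the statistics object shares the index name, which is the default for index statistics):
-- Density vector and histogram for the statistics behind Credit_ix_PayRequestStatus
DBCC SHOW_STATISTICS ('dbo.Credit', 'Credit_ix_PayRequestStatus');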
If I run the following query to defeat parameter sniffing (note that the value I optimize for is deliberately not the value I pass in):
declare @ID_Status int = 200
select * from Credit
where ID_PayRequestStatus = @ID_Status
OPTION (OPTIMIZE FOR (@ID_Status=100))
It uses an index seek on Credit_ix_PayRequestStatus, just as I intended.
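For comparison, the same query can also be run with a genuine parameter via sp_executesql; I'm including this variant only because the behaviour may differ between a local variable and a real parameter (I have not tested it):
-- Same predicate, but @ID_Status is a real parameter rather than a local variable
exec sp_executesql
    N'select * from dbo.Credit where ID_PayRequestStatus = @ID_Status',
    N'@ID_Status int',
    @ID_Status = 200;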
My main question is: why does optimizing for a value other than the one I actually compare against defeat parameter sniffing?
My secondary question is: why does SQL Server resort to parameter sniffing (i.e. the index scan) in the first place? It's a single table, the index is clearly defined, there are no joins, and there are no other criteria in the WHERE clause. I suspect it's because of the lopsided distribution of ID_PayRequestStatus, with 400 accounting for roughly 99% of the rows, so SQL Server decides that a scan is cheaper. However, queries against this table with ID_PayRequestStatus = 400 make up less than 1% of the total workload. Shouldn't auto-update statistics have kicked in by now and resolved this performance problem?
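In case it's relevant, statistics freshness can be checked with something like the following (assuming a version that has sys.dm_db_stats_properties, i.e. SQL Server 2008 R2 SP2 / 2012 SP1 or later):
-- Last update time, row counts and modifications since the last update
-- for every statistics object on dbo.Credit
select s.name as stats_name,
       sp.last_updated,
       sp.rows,
       sp.rows_sampled,
       sp.modification_counter
from sys.stats as s
cross apply sys.dm_db_stats_properties(s.object_id, s.stats_id) as sp
where s.object_id = object_id('dbo.Credit');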