I have a very big table with many rows and many columns (I know that's bad, but let's leave it aside).
Specifically, it has two columns of interest: `FinishTime` and `JobId`. The first is the row's finish time and the second is its id (not unique, but almost unique: only a few records share the same id).
I have an index on `JobId` and an index on `FinishTime`. We insert rows all the time, mostly ordered by finish time, and we update the statistics of each index periodically.
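For concreteness, here's a minimal sketch of the setup; the table and index names are made up for illustration:

```sql
CREATE TABLE dbo.Jobs
(
    JobId      INT      NOT NULL,  -- almost unique; a few rows can share an id
    FinishTime DATETIME NOT NULL   -- rows are inserted mostly in FinishTime order
    -- plus many more columns
);

CREATE INDEX IX_Jobs_JobId      ON dbo.Jobs (JobId);
CREATE INDEX IX_Jobs_FinishTime ON dbo.Jobs (FinishTime);
```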
Now to the problem:
When I run a query with the filter `JobId = <some id> AND FinishTime > <now minus 1 hour>`, the query takes a long time, and the estimated execution plan shows a scan over the `FinishTime` index, even though using the `JobId` index would be much better. Looking at the index statistics, I see that the server "thinks" the number of rows in the last hour is 1, because we haven't updated the statistics on that index since those rows were inserted.
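The slow case looks roughly like this (literal values are made up), and the histogram can be inspected with `DBCC SHOW_STATISTICS`:

```sql
-- The slow query: filter on JobId plus the last hour of FinishTime.
SELECT *
FROM dbo.Jobs
WHERE JobId = 12345
  AND FinishTime > DATEADD(HOUR, -1, GETDATE());

-- Inspect the histogram on the FinishTime index: its last step ends at the
-- highest FinishTime seen when statistics were last updated, so anything
-- newer is estimated as ~1 row.
DBCC SHOW_STATISTICS ('dbo.Jobs', 'IX_Jobs_FinishTime') WITH HISTOGRAM;
```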
When I run a query with the filter `JobId = <some id> AND FinishTime > <now minus 100 days>`, it works great, because SQL Server picks the correct index, the `JobId` index.
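For comparison, the variant that behaves well:

```sql
-- Same query with a 100-day window; here the optimizer picks the JobId index.
SELECT *
FROM dbo.Jobs
WHERE JobId = 12345
  AND FinishTime > DATEADD(DAY, -100, GETDATE());
```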
So basically my question is: why, if we don't update the index statistics all the time (which is time consuming), does the server assume that the number of records beyond the last histogram step is 1?
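Updating the statistics does fix the estimate, but it's expensive on a table this size. This is the kind of command we run periodically (FULLSCAN shown for illustration; a sampled update is cheaper):

```sql
-- Refresh the statistics on the FinishTime index so the histogram
-- covers the newest rows again.
UPDATE STATISTICS dbo.Jobs IX_Jobs_FinishTime WITH FULLSCAN;
```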
Thanks very much