We will need more information on your table examples of your queries to address this fully, but:
- DateTime columns should be highly selective by themselves, so an index with
TimeOperation
as the first column should address the bulk of queries against TimeOperation
.
- Do not add all columns blindly to an index, or even on included indexes - this will make the index page density worse and be counter productive (you would be duplicating your table in an index).
- If all data in your database centres around
TimeOperation
, you might consider building your clustered index around it.
- If you have queries just on
field1 = x
then you need a separate index just for field1
(assuming that it is suitably selective), i.e. no TimeOperation
on the index if its not in the WHERE clause of your query.
- Yes, you are right, when SQL locates a record in an index, it needs to do a key (or RID) lookup back into the cluster to retrieve the rest of the columns. If your non clustered index Includes the other columns in your
select
statement, the lookup can be avoided. But since you are using SELECT(*), covering indexes are unlikely to help .
Edit
Explanation - Selectivity and density are explained in detail here. e.g. iff your queries against TimeOperation
return only a small number of rows (rule of thumb is < 5%, but this isn't always), will the index be used, i.e. your query is selective enough for SQL to choose the index on TimeOperation
.
The basic starting point would be:
CREATE TABLE [MyTable]
(
intID INT ID identity(1,1) NOT NULL,
field1 NVARCHAR(20),
-- .. More columns, which may be selected, but not filtered
TimeOperation DateTime,
CONSTRAINT PK_MyTable PRIMARY KEY (IntId)
);
And the basic indexes will be
CREATE NONCLUSTERED INDEX IX_MyTable_1 ON [MyTable](TimeOperation);
CREATE NONCLUSTERED INDEX IX_MyTable_2 ON [MyTable](Field1);
Clustering Consideration / Option
If most of your records are inserted in 'serial' ascending TimeOperation order, i.e. intId and TimeOperation will both increase in tandem, then I would leave the clustering on intID (the default) (i.e. table DDL is PRIMARY KEY CLUSTERED (IntId)
, which is the default anyway).
However, if there is NO correlation between IntId
and TimeOperation
, and IF most of your queries are of the form SELECT * FROM [MyTable] WHERE TimeOperation between xx and yy
then CREATE CLUSTERED INDEX CL_MyTable ON MyTable(TimeOperation)
(and changing PK to PRIMARY KEY NONCLUSTERED (IntId)
) should improve this query (Rationale: since contiguous times are kept together, fewer pages need to be read, and the bookmark lookup will be avoided). Even better, if values of TimeOperation
are guaranteed to be unique, then CREATE UNIQUE CLUSTERED INDEX CL_MyTable ON MyTable(TimeOperation)
will improve density as it will avoid the uniqueifier.
Note - for the rest of this answer, I'm assuming that your IntId
and TimeOperations
ARE strongly correlated and hence the clustering is by IntId
.
Covering Indexes
As others have mentioned, your use of SELECT (*)
is bad practice and inter alia means covering indexes won't be of any use (the exception being COUNT(*)
).
If your queries weren't SELECT(*), but instead e.g.
SELECT TimeOperation, field1
FROM
WHERE TimeOperation BETWEEN x and y -- and returns < 5% data.
Then altering your index on TimeOperation
to include field1
CREATE NONCLUSTERED INDEX IX_MyTable ON [MyTable](TimeOperation) INCLUDE(Field1);
OR adding both to the index (with the most common filter first, or the most selective first if both filters are always present)
CREATE NONCLUSTERED INDEX IX_MyTable ON [MyTable](TimeOperation, Field1);
Either will avoid the rid / key lookup. The second (,) option will address your query where BOTH TimeOperation and Field1 are filtered in a WHERE or HAVING clause.
Re : What's the difference between index on (TimeOperation, Field1) and separate indexes?
e.g.
CREATE NONCLUSTERED INDEX IX_MyTable ON [MyTable](TimeOperation, Field1);
will not be useful for the query
SELECT ... FROM MyTable WHERE Field1 = 'xyz';
The index will only be useful for the queries which have TimeOperation
SELECT ... FROM MyTable WHERE TimeOperation between x and y;
OR
SELECT ... FROM MyTable WHERE TimeOperation between x and y AND Field1 = 'xyz';
Hope this helps?