Pros and cons will vary depending on the performance requirements and how often you will query this table.
As a first example think about the following:
CREATE TABLE Part_Price_Log (
ModifiedDate DATE,
PartID INT,
PRIMARY KEY (ModifiedDate, PartID))
If the ModifiedDate
is first and this is an logging table with insert-only rows, then every new row will be placed at the end, which is good (reduces fragmentation). This approach is also good when you want to filter directly by ModifiedDate
, or by ModifiedDate
+ PartID
, as ModifiedDate
is the first column in the primary key. A con here would be searching by PartID
, as the clustered index of the primary key won't be able to seek directly the PartID
.
A second example would be the same but inverted primary key ordering:
CREATE TABLE Part_Price_Log (
ModifiedDate DATE,
PartID INT,
PRIMARY KEY (PartID, ModifiedDate))
This is good for queries by PartID
, but not much for queries directly by ModifiedDate
. Also having PartID
first would make inserts displace data pages as inserted PartID
is lower than the max PartID
(which increases fragmentation).
The last example would be using a surrogate primary key like an IDENTITY
.
CREATE TABLE Part_Price_Log (
LogID BIGINT IDENTITY PRIMARY KEY,
ModifiedDate DATE,
PartID INT)
This will make all inserts go last and reduce fragmentation but you will need an additional index to query your data, such as:
CREATE NONCLUSTERED INDEX NCI_Part_Price_Log_Date_PartID ON Part_Price_Log (ModifiedDate, PartID)
CREATE NONCLUSTERED INDEX NCI_Part_Price_Log_PartID_Date ON Part_Price_Log (PartID, ModifiedDate)
The con about this last one is that insert operations will take longer (as the index also has to be updated) and the size of the table will increase due to indexes.
Also keep in mind that if your data allows for multiple updates of the same part for the same day, then using compound PRIMARY KEY
would make the 2nd update fail. Your choices here are to use a surrogate key, use a DATETIME
instead of DATE
(will give you more margin for updates), or use a CLUSTERED INDEX
with no PRIMARY KEY
or UNIQUE
constraint.
I would suggest doing the following. You only keep one index (the actual table, as it is clustered), the order is always insert, you don't need to worry about repeated ModifiedDate
with same PartID
and your queries by date will be fast.
CREATE TABLE Part_Price_Log (
LogID INT IDENTITY PRIMARY KEY NONCLUSTERED,
ModifiedDate DATE,
PartID INT)
CREATE CLUSTERED INDEX NCI_Part_Price_Log_Date_PartID ON Part_Price_Log (ModifiedDate, PartID)