-1

I have a temporal table that uses a natural primary key. This table stores a CSV file that was imported, it's a history table to keep track of every import. I starting to see some duplicates (in the CSVs I import) of the primary key field and would like to keep these records without loosing the functionality of the history table. I am currently manually removing rows from the CSV before import.

Is it possible to have a primary key consisting of a natural and surrogate key? Natural key is already used and an auto-incrementing key that only increments if the a duplicate natural id exists.

Daniel
  • 19
  • 4
  • 3
    *"I starting to see some duplicates of the Primary Key field"* This is impossible. A Primary Key is by definition unique, so there *cannot* be duplicates. – Thom A Jan 25 '22 at 17:16
  • 1
    'in the CSV' The rows are deleted from the CSV before the import, but would prefer to keep both records. – Daniel Jan 25 '22 at 17:44
  • You can't enforce a natural key in a history table as the "entity" identified by that key tuple changes over time. A simple example is , , . The natural key "Bob" cannot be enforced since you want (as best I can tell) to capture all those rows. You will need an artificial key as a unique identifier. In general, it is the actual entity table that enforces the natural key and the history table simply has a foreign key to former. Something seems fishy here. – SMor Jan 25 '22 at 19:59

1 Answers1

1

First, yes, you choose what the PK includes, so you can have it include both a natural and a surrogate key.

This part of your thought is strange to me: "auto-incrementing key that only increments if the a duplicate natural id exists". What leads you to ask this? You already admit the natural key alone cannot cover you. Why go to great lengths to avoid having a surrogate value if the natural one is not unique? Even then, you would just be able to use some obscure trigger to change the value to, for example, zero; it would still need to have a value, as part of the PK.

If what you have in mind is performance via the natural key, sure, just make the PK's first column the natural key and second one the surrogate, so you can benefit from the clustered index.

George Menoutis
  • 6,894
  • 3
  • 19
  • 43
  • The reason I was "going to great lengths to avoid a surrogate value" was because the more surrogate values the more records in the history tables. Following your idea here, and sorting the data before updating the table reduced the records in the history table and made use of performance. – Daniel Feb 09 '22 at 22:05