Questions tagged [surrogate-key]

A key in a database table, that has no intrinsic logical meaning, and was introduced for better physical organization of the database or other technical reasons.

Terminology

  • A "superkey" is any set of attributes that, when taken together, uniquely identify rows in the table.
  • A minimal1 superkey is called "candidate key", or just "key".
  • A "natural key" is a key that is comprised from attributes that have intrinsic meaning and would be present in the data model even if there was no natural key on top of them.
  • A "surrogate key", on the other hand, has no intrinsic meaning and is introduced purely for technical reasons, as discussed below.

Properties of Surrogate Keys

If there is a natural key in the model, it cannot be replaced with a surrogate key. A surrogate key can only be created in addition to any existing natural keys.

So the engineering decision is not on surrogate versus natural key, but on surrogate + natural key versus the natural key alone.

Having a surrogate key + natural key:

  • May make the FOREIGN KEYs in child tables slimmer.2
  • The surrogate never needs to change, and therefore never incurs ON UPDATE CASCADE referential action.
  • May be more friendly to object-relational mapping (ORM) tools.

Having only a natural key:

  • Makes the parent table slimmer.3
  • May play better with clustering.4
  • May make JOINs unnecessary in some situations.5
  • May be needed for correctly modeling certain kinds of diamond-shaped dependencies. For example, the following model guarantees that if B and C are connected to the same D, they must also be connected to the same A:

    enter image description here

    Note how A_ID gets propagated from the "diamond" top, down both sides and then is merged at the bottom.

Typical Implementations of Surrogate Keys

Most commonly, a surrogate key is implemented as an auto-incremented integer. Examples:

  • Oracle supports the SEQUENCE object, that can be used either directly in INSERT statement, or through an ON INSERT trigger.
  • MS SQL Server has the IDENTITY data type, and from the 2012 version, the explicit SEQUENCE object as well.
  • PostgreSQL supports the explicit SEQUENCE object, as well as the serial types that use sequences implicitly.
  • MySQL has the AUTO_INCREMENT attribute.

GUIDs or UUIDs are sometimes used when uniqueness must be guaranteed without a central "generator" for surrogate key values, such as in certain "disconnected" or replication scenarios.


1 That is, a superkey that would stop being unique (and therefore, being a superkey) if any of the attributes were removed from it.

2 Surrogates tend to use "slimmer" data types such as integers, versus "fatter" types such as strings that are often used in natural keys. Also, while it is not unusual for a natural key to be composite, there is almost never a reason to make a composite surrogate key. As a consequence, a FOREIGN KEY referencing surrogate key tends to be slimmer than a FK referencing natural key.

3 There is no need for the additional index "underneath" the surrogate key. Each new index incurs a maintenance cost for INSERT/UPDATE/DELETE operations and may be especially costly in clustered tables, where secondary indexes must typically contain the copy of the clustering key (which is often same as primary key) and may incur a double-lookup during querying.

4 It is often necessary to query the data on a range that is "aligned" with the natural key. Using the natural key as a clustering key physically orders the data in the table, which can dramatically diminish the I/O under some circumstances. On the other hand, surrogate keys typically do not poses an ordering that would be useful for clustering.

5 We can fetch the migrated natural key directly from the child table, without JOINing with the parent.

110 questions
0
votes
1 answer

What happens to surrogate keys of transactional system , when converting it to dimensioanl schema?

Our OLTP systems use several surrogate keys .Now we want to create a dimensional model for our system for analysis. Should we keep OLTP system surrogate keys and natural keys and also create one more datamart surrogate key? or shall we ignore the…
0
votes
1 answer

Is there a way to auto increment a column with respect to the foreign key in DB2?

Say I have an address table with the addresses of different facilities of a manufacturing company. The foreign key lets me know which company the addresses belong to, but i need a surrogate id to differentiate between each facility. This id should…
Ash
  • 33
  • 2
  • 9
0
votes
2 answers

Identifying a Surrogate Key

I am modeling as practice an Airline Agency. I have a table "Passenger" CREATE TABLE Passenger ( confirmationNum INTEGER NOT NULL, flightNum INTEGER NOT NULL, seatNum INTEGER NOT NULL, name VARCHAR(30)…
JCCS
  • 591
  • 8
  • 22
0
votes
1 answer

Why surrogate keys are needed?

I am reading about DW modeling and started wondering why surrogate keys are used at all? I understand that sometimes business keys are not integers nthat makes the life (as well as joiing and indexing) harder. However, what I do not understand is…
aviad
  • 8,229
  • 9
  • 50
  • 98
0
votes
2 answers

Database Design - How to enforce a foreign key constraint on a table with a Surrogate Key

My table schema is something like this 1. Master table : Clause Columns ClauseID = surrogate pk (identity) ClauseCode = nvarchar user specified value Class = nvarchar FK to a master class table etc... ClauseCode + Class = candidate key for this…
Jagmag
  • 10,283
  • 1
  • 34
  • 58
0
votes
2 answers

Creating custom surrogate key in pig

Is there any way to create a custom surrogate key in Pig?. Ex: we have data like below Salary City Name 20000 newyork john 30000 sydney joseph 60000 delhi mike 30000 sydney joseph And for this data we need to create the surrogate key as…
Avinash
  • 127
  • 2
  • 13
0
votes
1 answer

Database design - children of join tables key choice

I searched for answers to the following, but got answers mostly related to whether or not to use a surrogate at all, not about whether or not to then use compound keys on children of join tables. I have a table attendees and a table events. The join…
Squadrons
  • 2,467
  • 5
  • 25
  • 36
0
votes
1 answer

Table design for an ordered tree with composite keys

Which table design is prefered when considering database normalization rules: Consider the following designs for a node table that describes an ordered tree. The key is a composite key. ck: composite key parent: parent node next: the next node,…
Wouter
  • 2,540
  • 19
  • 31
0
votes
1 answer

Surrogate key from all column hash

I would like to create a surrogate key for a hive table, but one that could be replicated every time the data was put in the table. Other tables would reference this table through the surrogate key, and the table could be regenerated to add more…
aaron
  • 854
  • 2
  • 12
  • 23
0
votes
1 answer

Hibernate composite-key or surrogate-key

I have to design a table for storing some remote data. The data I am getting, via web service, has a candidate key, combination of two columns, but I have seen few posts where they discourage using composite-key in hibernate instead recommending to…
msmani
  • 730
  • 1
  • 7
  • 24
0
votes
3 answers

MySQL: Using natural primary index or adding surrogate when tables are given

I have 5 text-fields which I want to import into a MySQL/MariaDB database. But there are two problems: (1) The files are quite large: 0.5 GB to 10 GB (2) All relevant keys have 40 characters Point (1) I have to accept as it is and I can't change…
giordano
  • 2,954
  • 7
  • 35
  • 57
0
votes
1 answer

how to match the two load statements in pig

I have two load statements A and B. In each one I have a surrogate key. I want to match the surrogate key columns if both keys will match the stored data. I tried the following code. A = LOAD 'a/data/' using PigStorage('\t') as…
dazzles dina
  • 23
  • 1
  • 5
0
votes
1 answer

surrogate key implementation in fact table

i am so new to warehousing i have problem in implementing surrogate key. for example i have a Customer Dimension table which has a surrogate key for each customer in a specific region. like this: (SK_ NK_ Customer_ Region) ( 1 , …
mana
  • 1
  • 1
0
votes
1 answer

MySQL - Trigger - Before Insert and using the SK (Auto Increment)

I have a simple posts table in MySQL which has a POST_ID as the SK (surrogate key). Replies to the original post ID are stored in the same table in a PARENT_POST_ID column, but I want to perform the following logic: BEFORE INSERT (I think ...) IF a…
ash
  • 1,224
  • 3
  • 26
  • 46
0
votes
1 answer

Adding Existing Index in a table (which is made of two fields) as Primary Index? ax 2012

I have an index TestIndex in my table with two fields: Code1 and Code2. Index Properties: Allow Duplicates:No, AlternateKey: Yes Field Properties: Mandatory:Yes AllowEdit:No When I got to Table Properties PrimaryIndex:SurrogateKey. I want to change…