Questions tagged [surrogate-key]

A key in a database table, that has no intrinsic logical meaning, and was introduced for better physical organization of the database or other technical reasons.

Terminology

  • A "superkey" is any set of attributes that, when taken together, uniquely identify rows in the table.
  • A minimal1 superkey is called "candidate key", or just "key".
  • A "natural key" is a key that is comprised from attributes that have intrinsic meaning and would be present in the data model even if there was no natural key on top of them.
  • A "surrogate key", on the other hand, has no intrinsic meaning and is introduced purely for technical reasons, as discussed below.

Properties of Surrogate Keys

If there is a natural key in the model, it cannot be replaced with a surrogate key. A surrogate key can only be created in addition to any existing natural keys.

So the engineering decision is not on surrogate versus natural key, but on surrogate + natural key versus the natural key alone.

Having a surrogate key + natural key:

  • May make the FOREIGN KEYs in child tables slimmer.2
  • The surrogate never needs to change, and therefore never incurs ON UPDATE CASCADE referential action.
  • May be more friendly to object-relational mapping (ORM) tools.

Having only a natural key:

  • Makes the parent table slimmer.3
  • May play better with clustering.4
  • May make JOINs unnecessary in some situations.5
  • May be needed for correctly modeling certain kinds of diamond-shaped dependencies. For example, the following model guarantees that if B and C are connected to the same D, they must also be connected to the same A:

    enter image description here

    Note how A_ID gets propagated from the "diamond" top, down both sides and then is merged at the bottom.

Typical Implementations of Surrogate Keys

Most commonly, a surrogate key is implemented as an auto-incremented integer. Examples:

  • Oracle supports the SEQUENCE object, that can be used either directly in INSERT statement, or through an ON INSERT trigger.
  • MS SQL Server has the IDENTITY data type, and from the 2012 version, the explicit SEQUENCE object as well.
  • PostgreSQL supports the explicit SEQUENCE object, as well as the serial types that use sequences implicitly.
  • MySQL has the AUTO_INCREMENT attribute.

GUIDs or UUIDs are sometimes used when uniqueness must be guaranteed without a central "generator" for surrogate key values, such as in certain "disconnected" or replication scenarios.


1 That is, a superkey that would stop being unique (and therefore, being a superkey) if any of the attributes were removed from it.

2 Surrogates tend to use "slimmer" data types such as integers, versus "fatter" types such as strings that are often used in natural keys. Also, while it is not unusual for a natural key to be composite, there is almost never a reason to make a composite surrogate key. As a consequence, a FOREIGN KEY referencing surrogate key tends to be slimmer than a FK referencing natural key.

3 There is no need for the additional index "underneath" the surrogate key. Each new index incurs a maintenance cost for INSERT/UPDATE/DELETE operations and may be especially costly in clustered tables, where secondary indexes must typically contain the copy of the clustering key (which is often same as primary key) and may incur a double-lookup during querying.

4 It is often necessary to query the data on a range that is "aligned" with the natural key. Using the natural key as a clustering key physically orders the data in the table, which can dramatically diminish the I/O under some circumstances. On the other hand, surrogate keys typically do not poses an ordering that would be useful for clustering.

5 We can fetch the migrated natural key directly from the child table, without JOINing with the parent.

110 questions
1
vote
1 answer

JPA - Compound key vs generated id in Many to many table

I am creating a kind of social network and I have users that can follow other users. So I have an entity like: @Entity public class FollowedUser{ @ManyToOne private User user; @ManyToOne private User followedUser; //more…
Aliuk
  • 1,249
  • 2
  • 17
  • 32
1
vote
1 answer

Recommend datatype Number vs Varchar for a surrogate key

Can anyone of you please tell me which is the best data type for a surrogate key (number or varchar2)? I have read that surrogate key works best with the number type. But is the performance seriously have major impact with the data types. And is it…
Gunja
  • 11
  • 1
1
vote
1 answer

Generating surrogate keys remotely

Sorry in advance as this question is similar (but not the same!) to others. Anyway, I need to be able to generate surrogate keys in more than one location to be synchronized at a later time. I was considering using GUIDs, however these keys may have…
Tony Day
  • 2,170
  • 19
  • 25
1
vote
1 answer

Composite primary keys or surrogate primary key on relationship entities (entities that are also relations) in Doctrine2

I have a large database where I have entities that are relations. This exactly because of the following comment in the Doctrine 2 manual: I needed to store additional attributes in our relations and because of this the relation became an…
Wilt
  • 41,477
  • 12
  • 152
  • 203
1
vote
1 answer

Must Django ManyToManyField association tables have a surrogate key?

I'm mapping an existing database structure into Django models. I have a many-to-many structure where the association table is natural-keyed: CREATE TABLE foo (id INTEGER PRIMARY KEY); CREATE TABLE bar (id INTEGER PRIMARY KEY); CREATE TABLE foo2bar…
1
vote
1 answer

Database Design Has A relationship with Surrogate Key

First I want to be able to update all attributes/fields and there is no single unique key. I have a hierarchy like each PC can be categorized as PC1/PC2/PC3 can't give example and each PC has Network and Software Info. Design and examples are…
scc
  • 385
  • 1
  • 3
  • 12
1
vote
1 answer

how to perform update query using surrogate key

I am very new to database concepts and currently learning how to design a database. I have a table with below columns... this is in mysql: 1. Names - text - unique but might change in future 2. Result - varchar - not unique 3. issues_id - int - not…
Raju
  • 53
  • 5
1
vote
1 answer

What if I can not find a candidate key for my table (normalization)?

I am trying to understand how exactly normalization works and apply it in some specific project I have. I saw in several videos that the first step is to define a candidate key. I want to ask, what happens if none of the fields that I have (or a…
user1919
  • 3,818
  • 17
  • 62
  • 97
0
votes
1 answer

Oracle APEX Creating surrogate keys for composite PK

I have two tables I need to process. One table "Delivery" contains id of delivery as PK, id of storage to deliver as FK and id of storekeeper as FK. The other table "Delivery_log" contains composite primary key, which consists of id of delivery and…
Atemyn
  • 13
  • 3
0
votes
1 answer

How to lookup a surrogate key using two columns as business key in SSIS

I am working on a Data warehouse project; I have a design to follow a part of it is shown in the screenshot below: The source table is as follows: As the source table shows, the entity column is not unique in its own but unique if combined with…
Guissous Allaeddine
  • 425
  • 1
  • 4
  • 19
0
votes
0 answers

Can I add a surrogate key for a AWS redshift table?

I need to create and unique row in the redshift table. so is there a possibility to add the surrogate key to the redshift table?
0
votes
1 answer

How to populate fact table with Surrogate keys from dimensions?

Could someone please help me understand how to populate the fact table with Surrogate keys from dimensions using SSIS? I load my dimension tables and assign for each a surrogate key. I want to add these surrogate keys to my fact table but I don't…
chaneb
  • 67
  • 1
  • 6
0
votes
1 answer

Oracle UROWID or SEQUENCE

I need to create a surrogate identity key for some intermediate tables used in a stored procedure in Oracle. I found that ROWID inserted into a UROWID column works well but this is not the correct way in older versions of Oracle (before 10g) --…
user78706
0
votes
0 answers

Creating a hash key to serve as unique key

I have a very large table that has duplicate name and address information. This table feeds a process which performs a task and appends results back to the table. I'd like to reduce the volume of what's fed into this process by creating a hash key…
tad
  • 23
  • 5
0
votes
1 answer

Formatting the surrogate key based on other columns

I have a table in SQL Server create table student ( id int, batch varchar(10), branch varchar(10), registerNumber varchar(30) ) I want my registerNumber to be of type batch+branch+id eg a row in student id:1, batch:17, branch:BIT…