Questions tagged [surrogate-key]

A key in a database table, that has no intrinsic logical meaning, and was introduced for better physical organization of the database or other technical reasons.

Terminology

  • A "superkey" is any set of attributes that, when taken together, uniquely identify rows in the table.
  • A minimal1 superkey is called "candidate key", or just "key".
  • A "natural key" is a key that is comprised from attributes that have intrinsic meaning and would be present in the data model even if there was no natural key on top of them.
  • A "surrogate key", on the other hand, has no intrinsic meaning and is introduced purely for technical reasons, as discussed below.

Properties of Surrogate Keys

If there is a natural key in the model, it cannot be replaced with a surrogate key. A surrogate key can only be created in addition to any existing natural keys.

So the engineering decision is not on surrogate versus natural key, but on surrogate + natural key versus the natural key alone.

Having a surrogate key + natural key:

  • May make the FOREIGN KEYs in child tables slimmer.2
  • The surrogate never needs to change, and therefore never incurs ON UPDATE CASCADE referential action.
  • May be more friendly to object-relational mapping (ORM) tools.

Having only a natural key:

  • Makes the parent table slimmer.3
  • May play better with clustering.4
  • May make JOINs unnecessary in some situations.5
  • May be needed for correctly modeling certain kinds of diamond-shaped dependencies. For example, the following model guarantees that if B and C are connected to the same D, they must also be connected to the same A:

    enter image description here

    Note how A_ID gets propagated from the "diamond" top, down both sides and then is merged at the bottom.

Typical Implementations of Surrogate Keys

Most commonly, a surrogate key is implemented as an auto-incremented integer. Examples:

  • Oracle supports the SEQUENCE object, that can be used either directly in INSERT statement, or through an ON INSERT trigger.
  • MS SQL Server has the IDENTITY data type, and from the 2012 version, the explicit SEQUENCE object as well.
  • PostgreSQL supports the explicit SEQUENCE object, as well as the serial types that use sequences implicitly.
  • MySQL has the AUTO_INCREMENT attribute.

GUIDs or UUIDs are sometimes used when uniqueness must be guaranteed without a central "generator" for surrogate key values, such as in certain "disconnected" or replication scenarios.


1 That is, a superkey that would stop being unique (and therefore, being a superkey) if any of the attributes were removed from it.

2 Surrogates tend to use "slimmer" data types such as integers, versus "fatter" types such as strings that are often used in natural keys. Also, while it is not unusual for a natural key to be composite, there is almost never a reason to make a composite surrogate key. As a consequence, a FOREIGN KEY referencing surrogate key tends to be slimmer than a FK referencing natural key.

3 There is no need for the additional index "underneath" the surrogate key. Each new index incurs a maintenance cost for INSERT/UPDATE/DELETE operations and may be especially costly in clustered tables, where secondary indexes must typically contain the copy of the clustering key (which is often same as primary key) and may incur a double-lookup during querying.

4 It is often necessary to query the data on a range that is "aligned" with the natural key. Using the natural key as a clustering key physically orders the data in the table, which can dramatically diminish the I/O under some circumstances. On the other hand, surrogate keys typically do not poses an ordering that would be useful for clustering.

5 We can fetch the migrated natural key directly from the child table, without JOINing with the parent.

110 questions
0
votes
1 answer

What are the disadvantages in using IDs suffixed by a table identifying number?

I'm playing with: DROP TABLE users CASCADE; CREATE SEQUENCE users_id_seq MINVALUE 100; CREATE TABLE users ( id INTEGER NOT NULL PRIMARY KEY DEFAULT nextval('users_id_seq') * 100 + 1, ... ); ALTER SEQUENCE users_id_seq OWNED BY…
fadedbee
  • 42,671
  • 44
  • 178
  • 308
0
votes
1 answer

Should I use a surrogate key or use the natural keys?

I am building a list type to let users make lists of various things in my system. The list elements therefore need references to both the type and identifier of the things they will be pointing at. For instance, I will allow users to make lists of…
Zak
  • 24,947
  • 11
  • 38
  • 68
0
votes
1 answer

Surrogate keys for distributed systems

I am new to the world of big data, I would like to take a question regarding surrogate keys. Considering a distributed data system, creating surrogate keys with hash md5 seems interesting. At the same time, md5 uses attribute concatenation. My…
0
votes
1 answer

Using Surrogate Keys in Data Warehouse Pros and Cons

A surrogate key is a mechanism that exists in our books for years and I hate for bringing into discussion again. Everyone is talking about the benefits of using a surrogate key instead of a business key. Even Microsoft Analysis Services Tabular and…
Stavros Koureas
  • 1,126
  • 12
  • 34
0
votes
1 answer

Splitting a database with surrogate keys into physical and logical layer

I am rebuilding a big warehouse database which originally used natural keys, and now I want to switch to surrogate keys. Therefore, I am considering to split the database into a physical layer and a logical layer. In the physical layer, every…
0
votes
1 answer

Creating Surrogate Keys in Access

I have a project that I am working on where we have to create a registration system for a university in Access. One of the tables requires a surrogate key for one of the fields, and I am not sure how I can create a surrogate key. The column…
brent_mb
  • 337
  • 1
  • 2
  • 14
0
votes
1 answer

Oracle - Create Surrogate key for 300 million records

There is a table with millions of records which has duplicate records as well. what is the process of creating a new entity as surrogate key (which denotes sequence no.) E.g table structure col1 col2 101 A 101 A 101 B 102 A 102 B I would…
Sharanya
  • 1
  • 1
  • 1
0
votes
5 answers

How to avoid mistakes at Primary Key

Hi I'm a beginner at Databases, for this reason I want to ask you which atributes should I use as primary key to avoid mistakes: CREATE TABLE customer( name first_lastname street ZIP_code mobile_phone telephone …
apk
  • 37
  • 11
0
votes
0 answers

MySQL - "Prefixed IDs"?

Im working on a project, where I need to insert various "Products", BUT depending on the type (fixed amount of 3, won't change customer said) the "ID" should start with a certain number (2,3,4) While my first-thought was to simply use a (in the code…
dognose
  • 20,360
  • 9
  • 61
  • 107
0
votes
2 answers

Choosing a primary key,surrogate key, index in the mysql stock trading database

Database type: mysql Columns: Date,time,price1,qty1,price2,qty2 time will be in milliseconds number of records approx 5.5 million for a month. I cant choose date as primary key as it is not unique, but can choose date and time as combined but that…
ladz
  • 145
  • 9
0
votes
1 answer

Use index or replace values by surrogate ids and add a separate table?

Currently I've got a couple of pivot tables with this structure in mysql: user_id | organisation_id | type 4 3 external 4 3 internal 6 4 internal 7 4 external 6 5 external 7 5 external 7 6 internal As you can see, in my…
Jenssen
  • 1,801
  • 4
  • 38
  • 72
0
votes
2 answers

Surrogate-key in ViewModel instead of primary-key

Is there a best practice how to prevent showing autoincrement primary-key's (ID's) from database in the ViewModel, to not make them visible for end-user? I know there can be other unique fields on the table, which could be used. However, what if…
0
votes
0 answers

What is a Clustered Surrogate Key in a DW and what is it used for?

What is a Clustered Surrogate Key in a DW and what is it used for? What T-SQL code do I write to make a column in a dimtable a clustered surrogate key?
0
votes
2 answers

JPA, Mixed surrogate key with foreign key and sequence number

I've got two tables: DOCUMENT -------- DOC_ID (PK) . . . SECTION ------- DOC_ID (FK, PK) SECTION_NUM (PK) . . . Entries in the database might look like this: Document: DOC_ID | . . . -------------- 1 | . . . 2 | . . . Section: DOC_ID |…
superkruger
  • 369
  • 4
  • 10
0
votes
2 answers

Surrogate key in Oracle DB best practices

I have a few systems and I need to import data from these systems into a common table. But ID is unique only within source system. So I need to create my own primary key to uniquely identify this data in the target table. What is best practice to…
caruzo
  • 127
  • 1
  • 12