3

How is the foreign key implemented, for example, in PostgreSQL?

I noticed that a lot of hashing is involved in the creation of a foreign key, so I suppose that a hash based index is created on the foreign key column that references the primary key column. If so (for instance, when we want to remove a row from the referenced table - this one with the primary key or the so called master table) we can easily check if the row from the referenced table is actually referenced or not. What's more, probably DBMS requires that there is at least a B+ tree index on the referenced primary key column, because when we want to insert a new row to the referencing table, we can easily check if a row with the required primary key value exists in the referenced table. Some sources claim that a trigger is used to ensure the foreign key constraint.

elixenide
  • 44,308
  • 16
  • 74
  • 100
ady
  • 1,108
  • 13
  • 19
  • 2
    A (foreign key) constraint is an *abstract* concept. Implementations may implement it anyway they want. In *practice*, for performance reasons it is *almost* necessary to add an index. For theoretical reasons it is *required* that the referenced flelds in the FK are *at least* an unique way to address a row in the target table. That *in practice* often implies an index. – wildplasser Mar 15 '15 at 23:57
  • Yes, I know that there is a b+tree index on the primary (or unique) key, but is there any additional structure (for example, a hash-based index) on the referencing column(s)? – ady Mar 17 '15 at 09:02

2 Answers2

2

In Postgres, the referenced column(s) in the master table need to have a UNIQUE or PRIMARY KEY constraint. Per documentation:

The referenced columns must be the columns of a non-deferrable unique or primary key constraint in the referenced table.

Both of which are currently always implemented with a btree index. So there is always a btree index on the referenced column(s):

The referencing column(s) do(es) not have to be indexed at all. If rows in the master table are never updated or deleted, this may be good enough. Else, the referencing column(s) should also be indexed, but that's not enforced by the system. This is just about performance optimization, not data integrity.

The actual implementation of the FK constraint itself is an entry in the system catalog pg_constraint, a special internal trigger and another entry in pg_depend.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Thank you for your answer. As far as I know the b+tree index is created by default when a new primary key is defined (so this is a CPU intensive task because the rows have to be sorted before creating the b+tree in a bulk mode). I observed that a lot of hashing was involved in the creation of the FK in postgres, so I infer that probably an additional structure (maybe a hash-based index) is created on the referencing column(s) apart from the new entries in the pg_constraint and pg_depend. – ady Mar 17 '15 at 09:01
  • 1
    @adam.cajf: No additional structure is created other than what I already listed. What you observe is the verification of relational integrity. Upon creation, Postgres has to establish that existing values obey the new constraint. – Erwin Brandstetter Mar 17 '15 at 12:06
  • Yes, it makes sense to check if the values exist, but in this case the b+tree on the primary key should be used. So, why there is a lot of hashing? – ady Mar 18 '15 at 15:34
  • 1
    @adam.cajf: Not sure what you mean. But an index would hardly be useful when visiting most of the table anyway. The index is probably *not* going to be used for this particular purpose. – Erwin Brandstetter Mar 18 '15 at 16:25
  • 1
    What you probably saw was the implementation of an outer join between the primary key and the foreign key to detect foreign key entries not in the primary key. – David Aldridge Apr 11 '15 at 19:17
0

I checked that, indeed, there is no index created by PostgreSQL for a foreign key (using this query: https://stackoverflow.com/a/25596855/1245175).

On the other hand, a few triggers are created for a foreign key:

test=# SELECT tgname AS trigger_name
  FROM pg_trigger
 WHERE tgname !~ '^pg_';
 trigger_name
--------------
(0 rows)

test=# ALTER TABLE LINEITEM ADD CONSTRAINT LINEITEM_FK1 FOREIGN KEY (L_ORDERKEY)  REFERENCES ORDERS;
ALTER TABLE
test=# SELECT tgname AS trigger_name                                                               
  FROM pg_trigger
 WHERE tgname !~ '^pg_';
         trigger_name        
------------------------------
 RI_ConstraintTrigger_a_16419
 RI_ConstraintTrigger_a_16420
 RI_ConstraintTrigger_c_16421
 RI_ConstraintTrigger_c_16422

So, I suppose that during the foreign key creation in PostgreSQL, a hash map is created for the referenced table and then a probing is executed for each row of the referencing table.

Interestingly enough, MonetDB creates indexes of different types for primary and foreign keys (probably join-index and hash-index, respectively).

sql>select * from sys.idxs;
+------+----------+------+-------------+
| id   | table_id | type | name        |
+======+==========+======+=============+
| 6467 |     6446 |    0 | orders_pk   |
| 6470 |     6464 |    1 | lineitem_fk |
+------+----------+------+-------------+
2 tuples (3.921ms)

What's more, Oracle enforces primary key constraints using indexes and by default it does not create any index for a foreign key, however, there are some foreign key indexing tips: https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:292016138754

Community
  • 1
  • 1
ady
  • 1,108
  • 13
  • 19