0

I'm trying to design a tree schema in Postgres (not an adjacency list, not a directed acyclic graph). I'm having trouble getting the behaviors I assume are correct from the schemas that I've tried so far. I'm open to corrections in both my assumptions and schemas!

I started with something simple that should enforce the parent_id actually exists.

CREATE TABLE node (
    id SERIAL PRIMARY KEY,
    parent_id INT REFERENCES node (id) ON DELETE CASCADE
);

I like that I can insert NULL as the parent_id for the root node (that makes some recursive queries easier later).

INSERT INTO node (parent_id) VALUES (null),(1),(1),(2);

id  parent_id
--  ---------
1   [null]
2   1
3   1
4   2

But, I can also create unlimited NULL parent_id references, which is definitely not desirable.

INSERT INTO node (parent_id) VALUES (null),(null),(null);

id  parent_id
--  ---------
1   [null]
2   [null]
3   [null]

I could also have the root node reference itself (is that useful)?

INSERT INTO node (parent_id) VALUES (1),(1),(1),(2);

id  parent_id
--  ---------
1   1
2   1
3   1
4   2

Oh, but that means I can guess the next sequence on any INSERT, huh? That doesn't seem desirable.

INSERT INTO node (parent_id) VALUES (1),(2),(3)

id  parent_id
--  ---------
1   1
2   2
3   3

So, maybe I need to investigate that REFERENCE? The documentation seems to suggest either MATCH FULL or a NOT NULL constraint.

A value inserted into the referencing column(s) is matched against the values of the referenced table and referenced columns using the given match type. There are three match types: MATCH FULL, MATCH PARTIAL, and MATCH SIMPLE (which is the default). MATCH FULL will not allow one column of a multicolumn foreign key to be null unless all foreign key columns are null; if they are all null, the row is not required to have a match in the referenced table. MATCH SIMPLE allows any of the foreign key columns to be null; if any of them are null, the row is not required to have a match in the referenced table. MATCH PARTIAL is not yet implemented. (Of course, NOT NULL constraints can be applied to the referencing column(s) to prevent these cases from arising.)

https://www.postgresql.org/docs/9.6/sql-createtable.html

Let's try NOT NULL.

CREATE TABLE node (
    id SERIAL PRIMARY KEY,
    parent_id INT NOT NULL REFERENCES node (id) ON DELETE CASCADE
);

But, now I can't use NULL or a self-reference to create the root node... what gives?!

Anthony Mastrean
  • 21,850
  • 21
  • 110
  • 188
  • See [this answer.](https://stackoverflow.com/a/54134588/1995738) Note that the case of `(1,1)` may be interpreted as a cycle. – klin Feb 13 '19 at 17:28
  • @klin oh, this is interesting... I'm going to have to look at the index and the trigger parts, I suppose. – Anthony Mastrean Feb 13 '19 at 17:34
  • The short answer as you figured out by now is: There is no such Tree data structure for a relational database. It is all about self references and controlling it in a programmatic way. – Jorge Campos Feb 13 '19 at 17:41
  • @AnthonyMastrean - Note that the cited answer has been updated. – klin Feb 13 '19 at 22:29

0 Answers0