I'm trying to design a tree schema in Postgres (not an adjacency list, not a directed acyclic graph). I'm having trouble getting the behaviors I assume are correct from the schemas that I've tried so far. I'm open to corrections in both my assumptions and schemas!
I started with something simple that should enforce the parent_id actually exists.
CREATE TABLE node (
id SERIAL PRIMARY KEY,
parent_id INT REFERENCES node (id) ON DELETE CASCADE
);
I like that I can insert NULL as the parent_id for the root node (that makes some recursive queries easier later).
INSERT INTO node (parent_id) VALUES (null),(1),(1),(2);
id parent_id
-- ---------
1 [null]
2 1
3 1
4 2
But, I can also create unlimited NULL parent_id references, which is definitely not desirable.
INSERT INTO node (parent_id) VALUES (null),(null),(null);
id parent_id
-- ---------
1 [null]
2 [null]
3 [null]
I could also have the root node reference itself (is that useful)?
INSERT INTO node (parent_id) VALUES (1),(1),(1),(2);
id parent_id
-- ---------
1 1
2 1
3 1
4 2
Oh, but that means I can guess the next sequence on any INSERT, huh? That doesn't seem desirable.
INSERT INTO node (parent_id) VALUES (1),(2),(3)
id parent_id
-- ---------
1 1
2 2
3 3
So, maybe I need to investigate that REFERENCE? The documentation seems to suggest either MATCH FULL or a NOT NULL constraint.
A value inserted into the referencing column(s) is matched against the values of the referenced table and referenced columns using the given match type. There are three match types: MATCH FULL, MATCH PARTIAL, and MATCH SIMPLE (which is the default). MATCH FULL will not allow one column of a multicolumn foreign key to be null unless all foreign key columns are null; if they are all null, the row is not required to have a match in the referenced table. MATCH SIMPLE allows any of the foreign key columns to be null; if any of them are null, the row is not required to have a match in the referenced table. MATCH PARTIAL is not yet implemented. (Of course, NOT NULL constraints can be applied to the referencing column(s) to prevent these cases from arising.)
https://www.postgresql.org/docs/9.6/sql-createtable.html
Let's try NOT NULL.
CREATE TABLE node (
id SERIAL PRIMARY KEY,
parent_id INT NOT NULL REFERENCES node (id) ON DELETE CASCADE
);
But, now I can't use NULL or a self-reference to create the root node... what gives?!