0

I'm using PostgreSQL 14 to maintain a table which acts as an append-only ledger of changes done to a students table:

CREATE TABLE students (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL,
  city TEXT NOT NULL,
  mentor_of INTEGER REFERENCES students(id)
);

CREATE TABLE edits (
  id SERIAL PRIMARY KEY,
  student_id INTEGER REFERENCES students(id) NOT NULL,
  name TEXT,
  city TEXT,
  mentor_of INTEGER REFERENCES students(id)
);

Each row in the edits table references some student, the columns name, city and mentor_of give the new value to be applied to the respective column in the student table. A NULL value indicates that the field was not touched.

This works beautifully, but I lack a way to express that a column should be set to NULL. I.e. there is no way to have an edit row which indicates that the duplicate_of column of some student is to be set to NULL.

I considered adjusting the edits table such that each column is actually an array of one element. Thus, NULL indicates that e.g. the name was not touched, {NULL} indicates that the name was set to NULL and {Jill} indicates that the name was set to Jill. Unfortunately, arrays don't support foreign key constraints (yet?), so using this for the mentor_of column means weakening referential integrity.

I also looked into defining a composite type (like maybe_student_id which features one boolean field to indicate set/unset and the other field gives the new value to use). Alas, composite types don't seem to support foreign keys either.

Is there a way to have a (preferably generic, i.e. not type-specific) way to get an additional sentinel value on top of NULL?

Frerich Raabe
  • 90,689
  • 19
  • 115
  • 207

1 Answers1

2

You can add flags (as boolean columns) to indicate the changes. This way nulls will be applied as necessary. For example:

CREATE TABLE edits (
  id int not null PRIMARY KEY generated by default as identity,
  student_id INTEGER not null REFERENCES students(id),
  name varchar(100),
  city varchar(50),
  mentor_of INTEGER REFERENCES students(id),
  name_modified boolean, -- added
  city_modified boolean, -- added
  mentor_of_modified boolean -- added
);

Of course that pollutes the design. Other option is to consolitate all flags into a boolean array, as in:

CREATE TABLE edits (
  id int not null PRIMARY KEY generated by default as identity,
  student_id INTEGER not null REFERENCES students(id),
  name varchar(100),
  city varchar(50),
  mentor_of INTEGER REFERENCES students(id),
  modified boolean[] -- position 0 is for name, 1 for city, etc.
);

Another option is to consolidate everything into a single JSON column that indicates the changes. For example:

CREATE TABLE edits (
  id int not null PRIMARY KEY generated by default as identity,
  student_id INTEGER not null REFERENCES students(id),
  changes json
);

In this case if changes has the value {"name":"abc","city":null} that means that name and city are to be modified while mentor_of (not mentioned) should not be touched.

The Impaler
  • 45,731
  • 9
  • 39
  • 76
  • Thanks for the food of thought! Do you know if it's possible to have a JSON document using foreign keys, i.e. express a constraint which dictates that in a document like `{"mentor_id": 13}` there must be a row on the `students` table with the ID 13? – Frerich Raabe Feb 20 '23 at 14:35
  • @FrerichRaabe No, foreign keys can only apply to plain columns. If you want to enforce them, then the JSON option won't be useful. – The Impaler Feb 20 '23 at 19:45