Let's take a relational database, e.g. MySQL. To keep it simple, i will concentrate on the important things: Having a table that contains orders, with fields like order_id
(primary key) order_date
and a foreign key fk_supplier, which references a primary key in a table supplier
. This table also has a field called supplier_name
.
Now, lets imagine, there is a php-website that is showing all orders that were made in a table. Each row of the table consists of the order_id
, order_date
and the supplier_name (the sql-statement made a join over the two tables).
Everything is okay so far. Now, someone changes the name of one supplier that is referenced in one of the orders: the historical data becomes untrue or false.
My question is: What is the best practice, to prevent this? Three solutions come to my mind:
- Don't let the user change a supplier datarow, that is referenced in orders. Make him add a new supplier if the name changes.
- Always save the current supplier data (e.g. the supplier name) with the order record and don't use primary key / foreign key references.
- Introduce time-slices: every time, an important attribut of a supplier (like the name) is changed, create a new time slice. Reference not only the supplier_id in orders, but in addition the corresponding time-slice.
All this approaches have advantages and disadvantages. Point 2 e.g., seems pretty dirty and against all rules of a relational database. Point 3 would usually be the way to go, in my opinion. But needs a lot of effort, programming wise. The user experience / usability gets pretty bad, too.
I would like to hear, how experienced developers and database designers deal with that problem.