5

I'm creating a new database and plan to use temporal tables to log all changes. The data stored will be updated daily but will not be more than 5000 records per table

Is there any reason I shouldn't just make all tables temporal?

Ps. I am aware of the space usage of temporal tables, this is not as far as I understand a problem

  • 2
    Which [DBMS](https://en.wikipedia.org/wiki/DBMS) product are you using? "SQL" is just a query language, not the name of a specific database product (and not all databases support "temporal" tables). Please add a [tag](https://stackoverflow.com/help/tagging) for the database product you are using –  Oct 08 '19 at 12:25
  • Which dbms are you using? (Very product specific issue.) – jarlh Oct 08 '19 at 12:26
  • Added 2016 sql server tag, apologies – Ruanne Bornman Oct 08 '19 at 12:28

2 Answers2

4

I am aware of the space usage of temporal tables, this is not as far as I understand a problem

On the contrary - it's pretty big problem - and there are many other downsides too.

When you use Temporal Tables (at least in SQL Server), every UPDATE operation (even if the data is unchanged) results in a copy being made in the History table (granted, under-the-hood this may be a COW-optimized copy, but it's still another conceptual entity instance).

Secondly - from my personal experience working with LoB applications: most changes to databases are not important enough to justify creating an entire copy of a row, for example, imagine a table with 4 columns ( CREATE TABLE People ( FirstName nvarchar(50), LastName nvarchar(50), Address nvarchar(200), Biography nvarchar(max): whenever a typo in FirstName is fixed then all of the data in the other columns is copied-over, even if Biography contains a 4GB worth of text data - even if this is COW-optimized it's still creating copies for every user action that results in a change.

Is there any reason I shouldn't just make all tables temporal?

The main reason, in my experience, is that it makes changing your table schema much harder because the schemas (aka "table design") of the Active and History tables must be identical: so if you have a table with a NULL column that you want to change to a NOT NULL column and you have NULL values in your History table then you're stuck - at least until you write a data transformation step that will supply the History table with valid data - it's basically creating more work for yourself with little to gain.

Also, don't confuse Temporal Tables with Immutable, Append-only data-stores (like the Bitcoin Blockchain) - while they share similar design objectives (except true immutability) they exist to solve different problems - and if you consider the size requirements and scaling issues of the Ethereum block-chain (over a terabyte by now) then that should give you another idea why it's probably not a good idea.

Finally, even if Temporal Tables didn't have these issues - you still need to go through the effort to write your main software such that it can natively handle temporal data - and things like Entity Framework still don't have built-in support for querying Temporal Data.

...and even with all the historical records you've managed to save in the History table, what do you want it for? Do you really need to track every corrected typo and small, inconsequential change? How will your users react to needing to manually audit the changes to determine what's meaningful or not?

In short:

  • If your table design probably won't change much in the future...
  • AND small updates happen infrequently...
  • OR large updates happen regularly AND you need an audit record
  • ...then go ahead and use Temporal Tables wherever you can.
  • if not, then you're just creating more future work for yourself with little to gain.
Dai
  • 141,631
  • 28
  • 261
  • 374
  • Thanks, that actually clears up a lot, with regards to the copy of the row, will keep this in mind and only store columns that have small amounts of data. Thanks for clearing up the default aswell – Ruanne Bornman Oct 08 '19 at 12:40
  • 5
    The undertone of this response sadly makes the performance related aspects prevail over those of the additional expressive power you get from the temporal features in SQL. Lacking that expressive power when you're facing a need to keep and actively work with historical data is ***just as well*** a source of great pain for the developers. See the first chapters of "Time and Relational theory" for a detailed description of what that pain looks like. – Erwin Smout Oct 08 '19 at 14:31
3

"log all changes" is not a good use case for the temporal features in SQL.

The use case for the SYSTEM TIME temporal feature is when there is a pressing requirement obligating you/the user to be able to easily and quickly reconstruct (should be in scare quotes actually) the state that your database was in at a given moment in time. That is difficult and error-prone and expensive if all you have is a true log of past changes. But if you can suffice with keeping just a log of the changes, then do that (it will be difficult and error-prone and expensive to recreate past database states from the current state and your log, but that's not a pressing problem if there's no pressing need).

Also note that the SQL temporal features encompass also the notion of BUSINESS TIME, which is a different time dimension than the SYSTEM TIME one. Business time is targeted to keeping a history of the world situation, system time is targeted at keeping a history of your database itself, that is, a history of your records of the world situation.

Erwin Smout
  • 18,113
  • 4
  • 33
  • 52
  • I'm curious what your opinion is of a hypothetical JSON document-store database backed by a `git` repository, where every change results in a new commit. – Dai Oct 08 '19 at 20:42
  • 4
    Comment space too short to discuss in any reasonable level of detail, but my short advice is : don't waste your time. You'll be using a model of data vastly inferior to the relational one, and source version management systems are designed to handle only the number of commits that can reasonably be generated by the small number of humans who work on a system. Which isn't hundreds of thousands per day, as will be the case in any serious database installation. – Erwin Smout Oct 09 '19 at 12:10