4

I'm designing a system where questions like "what was the sum of all records matching certain criteria, at a certain time" are important.

Multi-value concurrency control (MVCC) seems like the way to go here, since it keeps an audit trail forever.

However, it would be nice if the data storage could handle this for me, rather than cobbling it together out of other database features. Now, Oracle and CouchDB other database engines use MVCC, but only behind the scenes. They use it to resolve conflicts or to decide what to do when a long-running query encounters recently updated data. But I don't know of any systems that allow you to explicitly say "in the system state of 17:00 January 20 2009, what does this query return". So are there such systems out there? Ideally, open source?

NeilK
  • 778
  • 1
  • 6
  • 15

2 Answers2

1

Take a look at flashback queries with Oracle.

Karl
  • 3,312
  • 21
  • 27
  • Oh, that's nice. But flashback still seems to be a way of peeking into the internal MVCC representation. You are able to set a timer so it will definitely keep old versions around for a while, but in the examples I see it's relatively short, like a few hours. The examples I have seen show this as a way to recover deleted data. – NeilK May 23 '11 at 00:03
  • Yeah, I've used it to recover from data deletions but there is no reason I know of not to use it for other reasons. How long you could retain data for would depend on the disc space available (flash back recovery space) versus change volumes. – Karl May 23 '11 at 01:24
1

As far as I know, MVCC doesn't guarantee the duration of earlier versions of rows. Its target is concurrency control for transactions; when all transactions for row 'x' have been committed, there's no real need for MVCC to keep earlier versions of that row.

You're thinking of Temporal Databases. Snodgrass's old book Developing Time-Oriented Database Applications in SQL is worth skimming, especially now that it's available as a PDF. If I had to do a temporal database from scratch again, I'd also read Temporal Data & the Relational Model, and anything else dealing with 6NF. (Where '6NF' <> 'DKNF'.)

Mike Sherrill 'Cat Recall'
  • 91,602
  • 17
  • 122
  • 185