
I'm looking for a way to use CouchDB or BigCouch (or another "compatible" DB) in such a way that all revision history can be maintained or at least archived. I know that CouchDB internally does this anyway, and only deletes old revisions upon compaction. Since CouchDB/BigCouch are open source, I would imagine it would be possible to hack something together to enable this feature. For example, copying every revision to an archive DB before the compaction process deletes them.

As an aside: I worked at a couple of companies that wanted an "audit history" of their SQL DB. We implemented this by creating an "audit table" and writing triggers that inserted a record into it whenever any other table was modified.
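To make the audit-table idea concrete, here is a minimal sketch using SQLite through Python's standard `sqlite3` module. The `customers` and `customers_audit` tables, their columns, and the trigger names are made up for illustration; they are not the schema we actually used.

```python
import sqlite3


def make_audited_db():
    """Return an in-memory SQLite DB where every change to `customers`
    is recorded in `customers_audit` by triggers (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);

        -- One audit row per insert, update, or delete on customers.
        CREATE TABLE customers_audit (
            audit_id    INTEGER PRIMARY KEY AUTOINCREMENT,
            action      TEXT NOT NULL,
            customer_id INTEGER,
            old_name    TEXT,
            new_name    TEXT,
            changed_at  TEXT DEFAULT CURRENT_TIMESTAMP
        );

        CREATE TRIGGER customers_ai AFTER INSERT ON customers
        BEGIN
            INSERT INTO customers_audit (action, customer_id, old_name, new_name)
            VALUES ('INSERT', NEW.id, NULL, NEW.name);
        END;

        CREATE TRIGGER customers_au AFTER UPDATE ON customers
        BEGIN
            INSERT INTO customers_audit (action, customer_id, old_name, new_name)
            VALUES ('UPDATE', NEW.id, OLD.name, NEW.name);
        END;

        CREATE TRIGGER customers_ad AFTER DELETE ON customers
        BEGIN
            INSERT INTO customers_audit (action, customer_id, old_name, new_name)
            VALUES ('DELETE', OLD.id, OLD.name, NULL);
        END;
    """)
    return conn


if __name__ == "__main__":
    conn = make_audited_db()
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
    conn.execute("UPDATE customers SET name = 'Ada L.' WHERE id = 1")
    for row in conn.execute("SELECT action, old_name, new_name FROM customers_audit"):
        print(row)
```

The point is that client code only ever touches `customers`; the history accumulates without the client knowing, which is the property I would like to get from CouchDB.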

Can someone with more knowledge of CouchDB tell me how this could be done? I'm curious whether anyone has done it before. It seems like it would be a very useful feature, so if it hasn't been done before, I wonder why not.

NOTE: this question is partially inspired by Datomic, a DB that has the desired properties. So what I'm looking for is basically an open-source, perhaps more lightweight, alternative to Datomic.

Otto

1 Answer


I've never liked the idea of using internal versioning to maintain a history. To me, it exists only to support the eventual-consistency functionality.

If I had to store a history, I'd look at a linked approach in which every update creates a new document that links back to the document(s) it was derived from. This way you can support features like:

  • create two or more new documents based on the same parent.
  • create one merged document that has two or more parents.

To support deletions, I would keep a single 'deleted document' and point any document that gets deleted to it.

New documents get a unique ID (I use CouchDB's UUID feature for that), so I have a full list of free roots.

I find a graph database useful for this, but you could simply store the references to the parents in each document. I have something like this in the documents:

[some other content],
parent_nodes: [list_of_parent_uuids],  # the direct ancestors, so you can build a graph
origin_nodes: [list_of_origin_uuids]   # the UUIDs of the original (root) documents, so you can build a view of all inheriting docs
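As a rough sketch of how such linked documents could be built, the Python below creates a root, two branches from the same parent, and a merge. It only builds plain dictionaries and does not talk to CouchDB. The helper names `new_root` and `derive` are hypothetical, Python's `uuid` module stands in for CouchDB's UUID feature, and it is an assumption here that a root lists its own ID in `origin_nodes`.

```python
import uuid


def new_root(content):
    """Create a brand-new document with no parents.

    Assumption: a root lists its own ID in origin_nodes, so a view keyed
    on origin_nodes returns the root together with all its descendants.
    """
    doc_id = uuid.uuid4().hex  # stand-in for a UUID obtained from CouchDB
    return {"_id": doc_id, **content, "parent_nodes": [], "origin_nodes": [doc_id]}


def derive(content, parents):
    """Create a new document from one parent (an update or a branch)
    or from several parents (a merge)."""
    origins = []
    for parent in parents:
        for origin in parent["origin_nodes"]:
            if origin not in origins:  # keep order, drop duplicates
                origins.append(origin)
    return {
        "_id": uuid.uuid4().hex,
        **content,
        "parent_nodes": [p["_id"] for p in parents],
        "origin_nodes": origins,
    }


if __name__ == "__main__":
    root = new_root({"title": "draft"})
    left = derive({"title": "draft, edited by A"}, [root])   # branch 1
    right = derive({"title": "draft, edited by B"}, [root])  # branch 2
    merged = derive({"title": "final"}, [left, right])       # merge
    print(merged["parent_nodes"], merged["origin_nodes"])
```

Nothing is ever overwritten in this scheme: each update is a new document, so compaction never discards history you care about.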
Hans
  • I like this idea, but I'm not clear on how this can be implemented in a way that's transparent to the user of the database, which is the question I was asking: I'm looking for a system that is automatic and doesn't require changing the client code. – Otto Aug 29 '14 at 17:41