I am considering storing changes/history-of-changes to database records as a git/hg repository, while current data still resides in database. If someone needs a history of changes, I will have to look it up in the repo.
There could be a folder for each collection, and filename would mirror "_id" (primary key). Hence, I will lookup the particular file for relevant info. Assumption is there is 1 primary key field (i.e no composites). Database I am using is mongodb, so records are in a document format anyways. I just need to store the JSON in a file.
Advantages to me are : Versioning systems are perfectly suited for storing/diffs/changes overtime. I can query specific versions, and list of changes too from base -> current.
Disadvantages: I can't think of any except it is novel. IO access could be an issue if history is highly sought after (but not in my case). I am ok with slower retrieval of historical data. It expected to be a rare event in anycase.
So my question is : Is there some obvious drawback which I am overlooking?
Thanks.
Clarification: The reason I want to do it this way is because, I expect history to grow and potentially be have to kept for an infinite amount of time (ideally). I could improve this by keeping on what has changed, but it is extra effort and not trivial.
An additional thing to consider is that the speed of mongodb comes from having indexes in memory. If it has to maintain indexes for both actual+history, I will need 2x-3x more RAM than I would otherwise need. As of now, it is a premium (may not always be), but still.