
I need to keep track of changes made to objects in a database.

A trivial implementation would be a mirror table that gets records inserted into it by triggers, either inside the database or from the application. But that affects performance, and over time the mirror table gets huge and essentially doubles maintenance effort: whenever the original table has to be altered, the mirror table needs to reflect that change.

Since my biggest requirement here is minimal impact on database and application performance, my current preference is to dump changes into syslog-ng over UDP and store them in plain text files.
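The fire-and-forget idea above can be sketched in a few lines. This is a minimal illustration, not production code: the port, field names, and `log_change` helper are all hypothetical, and a real setup would point at an actual syslog-ng UDP listener.

```python
import json
import socket
import time

# Illustrative: a local syslog-ng instance listening for UDP datagrams.
SYSLOG_ADDR = ("127.0.0.1", 514)

def log_change(table, object_id, field, old_value, new_value):
    """Serialize one change record as JSON and send it as a UDP datagram."""
    record = {
        "ts": time.time(),
        "table": table,
        "id": object_id,
        "field": field,
        "old": old_value,
        "new": new_value,
    }
    payload = json.dumps(record).encode("utf-8")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Best-effort: a lost datagram loses one changelog line, but the
        # changelog must never slow down or break the application's write path.
        sock.sendto(payload, SYSLOG_ADDR)
    except OSError:
        pass
    finally:
        sock.close()
    return payload

log_change("cars", 42, "color", "red", "blue")
```

The key property is that `sendto` on a datagram socket never blocks waiting for the receiver, so the application's write path is barely affected even if the log collector is slow or down.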

After all, a changelog is not something that will be accessed frequently, so it's even OK to have it archived over time. But obviously, with such a setup, actually accessing that data is quite tricky.

So I guess my question is: is there already a system that at least partially suits my needs? A perfect fit would be a UDP-accessible, schema-less, append-only database system that can archive data automatically (or at least with minimal configuration), or that degrades insert performance only very slowly. MongoDB? CouchDB? YourDB?

keymone
  • an array of changes (incremental update), probably saved as a serialized object or JSON, where you can also array-merge to see the current merged changes. – mdprotacio Feb 06 '12 at 17:42
  • yeah, that's what I intend to do; the question is what storage to use so it's easy to maintain and time-slice. – keymone Feb 28 '12 at 13:20
  • if the question is the storage, you could store it anywhere: a file, a field in a database, or a persistent cache. It's your call. – mdprotacio Feb 29 '12 at 11:06

6 Answers


Well, there are lots of ways to approach this. I am most familiar with MongoDB and so would lean in that direction. In general I think it will meet your needs for performance, and using a replica set, with reads coming off the slaves would likely be the approach to take. However, the versioning is not built in. You can see one approach to versioning with Mongoid::Versioning here:

Mongoid::Versioning - how to check previous versions?

The other solutions you mentioned may have better native support, but I can't speak to that. Hopefully this at least gives you some pointers on the MongoDB side of things.

Adam Comerford
  • how would I solve the archiving issue here? Wouldn't moving data from one collection to another block it for writes (inserts)? Also, I'm looking for a more log-like solution, and MongoDB seems to provide too many features that I don't need in this case. – keymone Feb 21 '12 at 11:12

Have a look at mongoid-history.

It tracks the history of changes (what, when, and by whom) along with a version number, and it comes with configuration options.

Sandip Ransing

RavenDB has this feature natively (but it might be too young a NoSQL DB for production needs; up to you, of course):

http://ravendb.net/docs/server/bundles/versioning

http://www.slideshare.net/jwoglamott/battle-of-nosql-stars-amazons-sdb-vs-mongodb-vs-couchdb-vs-ravendb

If you want to go with MongoDB, two implementation strategies are suggested in this thread:

Strategy 1: embedding the history will not impact your write performance, nor your reads if you tweak your code to avoid returning the history when it's not necessary; however, you have the 16 MB per-document limit (which might be a blocker for you or not). Strategy 2: writing history to a separate collection plainly requires two write operations. I agree with what's said there: these (or a combination) are the strategies available in MongoDB.
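The embed-history strategy above can be illustrated with plain dicts, which is also what the document would look like in MongoDB; with a real driver you would issue an update with `$set` and `$push` instead of mutating in memory. All field names here are illustrative, not from any particular schema.

```python
def apply_change(doc, field, new_value):
    """Apply one field change to a document, appending it to the
    document's own embedded, append-only history list."""
    change = {
        "version": doc["version"] + 1,
        "field": field,
        "old": doc.get(field),
        "new": new_value,
    }
    doc[field] = new_value
    doc["version"] += 1
    # In MongoDB this embedded list counts toward the 16 MB per-document
    # limit, so very frequently updated documents can eventually outgrow it.
    doc["history"].append(change)
    return doc

car = {"_id": 1, "color": "red", "version": 1, "history": []}
apply_change(car, "color", "blue")
```

The read-side tweak mentioned above corresponds to projecting the document without the `history` field unless the caller actually asks for past versions.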

CouchDB internally uses an MVCC approach (which you could leverage, as suggested here), but on SO this kind of approach is debated. There is a question on this topic, and the proposed solution is similar to the embedded strategy described above for MongoDB (so you should pick the one you prefer).


For simple purposes (MySQL!), just do an AFTER UPDATE trigger on the tables you'd like to keep record of.

For example, for a table cars with fields:

  carId (primary key)
  color
  manufacturer
  model

create a table cars_history (or an equivalent name) with fields:

  carId
  field
  old_value
  new_value

and add an AFTER UPDATE trigger like this:

delimiter //

CREATE TRIGGER trigger_changes
AFTER UPDATE ON cars
FOR EACH ROW
BEGIN
  -- one row in cars_history per changed field; independent IF blocks
  -- so that an update changing several fields logs all of them
  IF OLD.manufacturer <> NEW.manufacturer THEN
    INSERT INTO cars_history (carId, field, old_value, new_value)
    VALUES (OLD.carId, 'manufacturer', OLD.manufacturer, NEW.manufacturer);
  END IF;
  IF OLD.color <> NEW.color THEN
    INSERT INTO cars_history (carId, field, old_value, new_value)
    VALUES (OLD.carId, 'color', OLD.color, NEW.color);
  END IF;
END;//

delimiter ;

untested, so may contain syntax errors :) hope that helps anyway!

Stefan

What about SQLite? Each DB is a self-contained file that you can easily rename and move when archiving. If the file is renamed or moved, another one is created automatically on the next insert.
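The rename-to-archive idea can be sketched with the standard library's `sqlite3` module. The paths, table layout, and `open_log` helper are illustrative assumptions; the point is only that archiving is a plain file rename and reopening the old path yields a fresh, empty database.

```python
import os
import sqlite3
import tempfile

# Illustrative paths; a real deployment would use fixed locations.
workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "changelog.db")
archive_path = os.path.join(workdir, "changelog-2012-02.db")

def open_log(path):
    """Open (or create) a changelog database at the given path."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS changelog (ts REAL, data TEXT)")
    return conn

conn = open_log(db_path)
with conn:
    conn.execute("INSERT INTO changelog VALUES (1.0, 'old record')")
conn.close()

# Archiving is just a rename: no export step, no server involvement.
os.rename(db_path, archive_path)

# Reconnecting by the same path transparently creates an empty database,
# so the application keeps writing without any special handling.
conn = open_log(db_path)
```

The archived file remains a fully queryable SQLite database, so the "accessing archived data is tricky" concern from the question largely goes away: just open the old file read-only.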

The only issue with SQLite is concurrent writes, which require locking the file for writing. It can do about 60 transactions per second, but you can do thousands of inserts in one transaction (see the docs).
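The batching point deserves emphasis, since it is what makes SQLite viable for a write-heavy changelog: the per-transaction cost dominates, not the per-row cost. A minimal sketch (table and column names are again illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE changelog (ts REAL, tbl TEXT, field TEXT, old TEXT, new TEXT)"
)

# 10,000 change records to persist.
rows = [(float(i), "cars", "color", "red", "blue") for i in range(10000)]

# One explicit transaction around the whole batch: a single commit (and
# a single fsync) instead of one per row, which is where the ~60 tx/s
# ceiling comes from.
with conn:
    conn.executemany("INSERT INTO changelog VALUES (?, ?, ?, ?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM changelog").fetchone()[0]
```

In practice this suggests buffering incoming change records in the application and flushing them to the changelog file every second or so, rather than committing each one individually.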

Sony Santos
  • it's a good idea, but unfortunately SQLite is not schema-less. Models change, and I don't want to spend too much time maintaining the versioning/changelog system. – keymone Feb 22 '12 at 16:39

I wonder if this is the type of solution you're looking for: http://www.tonymarston.net/php-mysql/auditlog.html

It appears to be a very simple, elegant solution with a small data footprint and I would expect it to have minimal impact on insert times as well.

Rose