
I have an application that uses OrientDB as the persistence layer with great success, and I have arrived at a crossroads while looking at implementing an audit trail.

IMO an audit should consist of 4 core properties:

  • Timestamp (When the action happened)
  • Actor (The entity performing the action)
  • Target (The entity being acted on)
  • Action (This can be made up of lots of subparts like pre and post states, descriptions etc. depending on your requirements.)

On one hand, this type of relationship between the various entities is perfect for a graph database. On the other hand (again IMO), graph databases are not particularly well equipped for, or easy to manipulate with, time-based data.

I am now torn between 2 ideas:

Create a time series

Basically the idea is to create a cascading time series in the database that represents time in segmented vertex nodes.


I would create an Audit vertex that contains all my Action properties, and I would then create Edge relationships between the Audit and the closest time vertex node, between the Audit and the Actor, and between the Audit and the Target.
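To make the structure concrete, here is a minimal in-memory sketch in Python, not OrientDB code. It assumes the time tree is segmented as year, month, day and hour; the `TimeNode`, `Audit` and `attach` names are hypothetical, and the `actor` and `target` strings stand in for edges to real vertices.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class TimeNode:
    """One segment of the time tree (a year, month, day or hour)."""
    label: str
    children: dict = field(default_factory=dict)  # next, finer level
    audits: list = field(default_factory=list)    # edges to Audit vertices


@dataclass
class Audit:
    timestamp: datetime
    actor: str    # stands in for an edge to the Actor vertex
    target: str   # stands in for an edge to the Target vertex
    action: str


def attach(root: TimeNode, audit: Audit) -> TimeNode:
    """Walk year -> month -> day -> hour, creating nodes as needed,
    then link the audit to the closest (hour) node."""
    ts = audit.timestamp
    node = root
    for part in (ts.year, ts.month, ts.day, ts.hour):
        if part not in node.children:
            node.children[part] = TimeNode(str(part))
        node = node.children[part]
    node.audits.append(audit)
    return node


root = TimeNode("root")
leaf = attach(root, Audit(datetime(2015, 3, 14, 9, 30), "john", "invoice-42", "update"))
print(leaf.label, len(leaf.audits))
```

Every insert has to walk (and possibly create) four levels of time nodes before the audit itself can be linked, which is the extra complexity being weighed here.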


Create a straight relationship

The time series adds a level of complexity that needs to be considered. Assume that audits need to be searched by three root criteria: Actor, Target and Time. Because Time is almost always a requirement, the time series is an appealing solution for queries such as "Get me all the Audits for the past 30 days".

But for a query along the lines of "Get me all the Audits for the past 30 days where john was the Actor", I could conceivably ignore the time relationship and just create the Audit with a timestamp property and an Edge to each of the Actor and the Target.


Or the timestamp could be a property of the Edge as well!
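As a rough illustration of the straight-relationship option, the Python sketch below (again in-memory, not OrientDB code) starts from the Actor, follows its edges to the Audits, and filters on the timestamp. The `by_actor` dictionary and the `audits_for_actor` function are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Audit:
    timestamp: datetime
    target: str   # stands in for an edge to the Target vertex
    action: str


# Actor name -> audits reachable over that Actor's outgoing edges.
by_actor = {
    "john": [
        Audit(datetime(2015, 3, 1), "invoice-42", "update"),
        Audit(datetime(2015, 1, 5), "invoice-17", "delete"),
    ],
    "jane": [Audit(datetime(2015, 3, 10), "invoice-42", "read")],
}


def audits_for_actor(actor: str, now: datetime, days: int = 30) -> list:
    """Audits by `actor` in the `days` before `now`, no time tree involved."""
    cutoff = now - timedelta(days=days)
    return [a for a in by_actor.get(actor, []) if a.timestamp >= cutoff]


recent = audits_for_actor("john", now=datetime(2015, 3, 14))
print([a.action for a in recent])
```

The cost of this query grows with the number of audits attached to one Actor, so it stays cheap as long as the search always starts from an Actor or a Target rather than from time alone.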

Is there really any advantage to the time series, or am I just adding unnecessary complication? Is there a better, already accepted practice for auditing events in graph databases?

Jeen Broekstra
tarka

1 Answer


Using the time series graph makes sense, especially if you use it to store pre-calculated aggregate data on the time vertices. That way, aggregate queries like "give me the number of actions in the last three months" become very efficient, because they read a few stored counters instead of visiting every audit.

Here you can find some examples about this http://www.slideshare.net/LuigiDellAquila/orientdb-time-representation
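A minimal sketch of the pre-calculated aggregate idea, written as plain Python rather than OrientDB code: each month node keeps a counter that is incremented when an action is recorded. The `month_counts`, `record` and `count_last_months` names are hypothetical, and "last three months" is assumed here to mean the current calendar month plus the two before it.

```python
from collections import defaultdict
from datetime import datetime

# (year, month) -> number of actions recorded in that month.
# Stands in for a counter property stored on each month vertex.
month_counts = defaultdict(int)


def record(ts: datetime) -> None:
    """Update the aggregate at write time."""
    month_counts[(ts.year, ts.month)] += 1


def count_last_months(now: datetime, months: int = 3) -> int:
    """Sum the stored counters for the current month and the ones before it."""
    total = 0
    year, month = now.year, now.month
    for _ in range(months):
        total += month_counts.get((year, month), 0)
        month -= 1
        if month == 0:          # step back across a year boundary
            year, month = year - 1, 12
    return total


for ts in (datetime(2015, 1, 10), datetime(2014, 12, 24),
           datetime(2014, 11, 2), datetime(2014, 10, 30)):
    record(ts)

print(count_last_months(datetime(2015, 1, 15)))
```

The query touches three counters regardless of how many actions were recorded, at the price of one extra counter update per write.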

Luigi Dell'Aquila