Context
We have a distributed system. One of its subsystems emits events, which another subsystem reads to generate reports.
Logical order is guaranteed: even though the emitting system has N nodes, there is an underlying finite state machine that makes concurrent emission of events for a single aggregate impossible. Each event is marked with a timestamp, but the N nodes are not always in sync about the time.
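To make the discussion concrete, here is a minimal sketch of what such an event could look like; the field names are hypothetical and just mirror the table further down:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Event:
        name: str           # e.g. "Event 1"
        timestamp: float    # node-local wall clock; nodes may drift apart
        logical_order: int  # per-aggregate sequence number, authoritative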
We care a lot about the timestamp because the downstream system that generates the reports almost always needs one: the reporting people rely on this kind of data to check that things are going the right way.
The problem
The fact that two nodes can have a small clock discrepancy is making us think. Consider the following example.
The logical order of the events is this:
Event 1 => Event 2 => Event 3
But in the Database we could have this situation:
---------------------------------------
| Name    | Timestamp | Logical Order |
---------------------------------------
| Event 1 |     2     |       1       |
| Event 2 |     1     |       2       |
| Event 3 |     3     |       3       |
---------------------------------------
As you can see, Event 2 logically happened after Event 1, but its timestamp is earlier because the two timestamps come from nodes whose clocks were not in sync.
Granted, this is not going to happen every two seconds, but it can happen whenever consecutive events are stamped by different nodes. And from a reporting point of view it is an anomaly.
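For what it's worth, a detection pass on the reporting side makes the anomaly concrete. This is only a sketch: the find_timestamp_anomalies name is hypothetical, and it assumes the Event shape sketched above, with one aggregate's events already sorted by logical order:

    def find_timestamp_anomalies(events):
        # `events`: one aggregate's events, sorted by logical_order.
        # Flags each adjacent pair whose wall-clock timestamps run
        # backwards relative to the logical order.
        return [
            (prev, curr)
            for prev, curr in zip(events, events[1:])
            if curr.timestamp < prev.timestamp
        ]

On the table above, this returns the (Event 1, Event 2) pair.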
Possible solutions
- Make the reporting people aware of the possible problem. We cannot have one global source of time (NTP is not a good solution for us, for some good reasons), so a discrepancy of a very small amount of time is not a problem; it simply means "this event happened around that time".
- Ensure timestamp consistency by checking that the next event in the logical flow never gets a timestamp lower than the previous event's, making them equal when it would. This is not the truth, but it keeps the flow consistent even from a non-developer point of view (see the sketch after this list).
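For the second option, a minimal sketch of the clamping, again assuming the hypothetical Event dataclass above and events sorted by logical order:

    from dataclasses import replace

    def clamp_timestamps(events):
        # `events`: one aggregate's events, sorted by logical_order.
        # Rewrites timestamps so they never decrease: an event whose
        # clock lagged inherits the highest timestamp seen so far.
        clamped, high_water = [], float("-inf")
        for event in events:
            high_water = max(high_water, event.timestamp)
            clamped.append(replace(event, timestamp=high_water))
        return clamped

Applied to the table above, Event 2 would be stored with timestamp 2, equal to Event 1, while Event 3 keeps timestamp 3.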
Have you got any experience with this topic?