For context, I have a recorded event stream published from an Aeron Cluster for API gateways to consume. This is separate from the clients' egress channel. For any successful state change, the cluster publishes an event to the event stream. The gateways consume these events and build their own state, which is saved in persistent storage. As such, it's vital that no events are lost from the stream and that every node in the cluster has the same sequence of events in its stream.

My implementation is straightforward: maintain an individual event stream per cluster node, with each node publishing events to its own stream. Cluster clients can then choose which node to consume events from. However, this introduces duplicate events into the stream whenever a cluster node restarts, because the log replay republishes events that were already recorded. One workaround I can think of is to truncate the recording back to the position of the latest event saved in the snapshot, plus introduce a sequence number into every event so the cluster clients can deduplicate on their side (a rough sketch of that deduplication is below). This feels quite hacky, which got me wondering whether there is a better way of implementing persisted event streams that are consistent across nodes with Aeron Cluster. Appreciate any feedback!
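For illustration, the consumer-side deduplication I have in mind would look something like the sketch below. It is a sketch only: it assumes each event leads with a long sequence number, and `applyEvent` stands in for the gateway's own persistence logic.

```java
import io.aeron.logbuffer.FragmentHandler;
import io.aeron.logbuffer.Header;
import org.agrona.DirectBuffer;

// Deduplicating fragment handler for a gateway consuming a node's event stream.
final class DedupingEventHandler implements FragmentHandler
{
    // Restored from the gateway's persistent store on startup.
    private long lastAppliedSequence;

    DedupingEventHandler(final long lastAppliedSequence)
    {
        this.lastAppliedSequence = lastAppliedSequence;
    }

    @Override
    public void onFragment(final DirectBuffer buffer, final int offset, final int length, final Header header)
    {
        final long sequence = buffer.getLong(offset); // assumes the sequence number leads each event

        if (sequence <= lastAppliedSequence)
        {
            return; // duplicate republished after a node restart; skip it
        }

        applyEvent(buffer, offset + Long.BYTES, length - Long.BYTES);
        lastAppliedSequence = sequence;
    }

    private void applyEvent(final DirectBuffer buffer, final int offset, final int length)
    {
        // Hypothetical: persist the event and the new sequence number atomically.
    }
}
```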

David Teh

1 Answer

There are a few approaches to ensure the stream is consistent across nodes and there is no duplication on replay.

The simplest approach is to store the event stream position you've reached in the snapshot. On recovery, you can truncate the recording back to that position (or zero if you have no snapshots) and then events are repopulated as the log is replayed.
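A minimal sketch of how that might look, assuming the service publishes events via an `ExclusivePublication` and the snapshot carries just the stream position. The class and method names here are illustrative rather than Aeron Cluster API, and looking up the recording id (e.g. via `AeronArchive.listRecordingsForUri`) is elided:

```java
import io.aeron.ExclusivePublication;
import io.aeron.Image;
import io.aeron.archive.client.AeronArchive;
import io.aeron.logbuffer.FragmentHandler;
import org.agrona.concurrent.UnsafeBuffer;

import java.nio.ByteBuffer;

public class EventStreamSnapshotSupport
{
    private final ExclusivePublication eventPublication; // per-node event stream
    private long restoredEventStreamPosition;

    public EventStreamSnapshotSupport(final ExclusivePublication eventPublication)
    {
        this.eventPublication = eventPublication;
    }

    // From ClusteredService.onTakeSnapshot: store the position the event
    // stream has reached alongside the rest of the service state.
    public void saveToSnapshot(final ExclusivePublication snapshotPublication)
    {
        final UnsafeBuffer buffer = new UnsafeBuffer(ByteBuffer.allocateDirect(Long.BYTES));
        buffer.putLong(0, eventPublication.position());

        while (snapshotPublication.offer(buffer, 0, Long.BYTES) < 0)
        {
            Thread.yield(); // a real service would use the cluster idle strategy
        }
    }

    // From ClusteredService.onStart: read the stored position back from the
    // snapshot image before log replay begins.
    public void loadFromSnapshot(final Image snapshotImage)
    {
        final FragmentHandler handler = (buffer, offset, length, header) ->
            restoredEventStreamPosition = buffer.getLong(offset);

        while (!snapshotImage.isEndOfStream())
        {
            snapshotImage.poll(handler, 1);
        }
    }

    // Before replay: truncate the event-stream recording back to the snapshot
    // position so replayed log messages repopulate it exactly once.
    public void truncateEventStream(final AeronArchive archive, final long recordingId)
    {
        archive.truncateRecording(recordingId, restoredEventStreamPosition);
    }
}
```

Note that `AeronArchive.truncateRecording` operates on a stopped recording, so the truncation needs to happen before the node restarts its event-stream publication and recording.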

James