2

I am reading “Building Event-Driven Microservices”.

It states a requirement of the event broker is:

Infinite retention: Event streams must be able to retain events for an infinite period of time. This property is foundational for maintaining state in an event stream.

However, this does not seem to be commonly available. For example, Azure Event Hubs has a maximum retention period of 7 days, and Event Grid also has maximum retry and retention time policies.

What are techniques for coping with retention limitations of the event broker technology that is used?

MattHH
  • 57
  • 4

2 Answers

3

Well, it depends on the purpose of your events.

If you are using events to orchestrate actions between different services, you are using events to decouple those services from each other. Such an event contains everything other services need to do whatever they are supposed to do. In this case, there is no need to keep those events in a store until the end of time. As soon as a service has handled the event, it has determined the new state of its objects and has most likely stored this updated state in some kind of service-specific store.
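A minimal sketch of this pattern (all names here are illustrative, not from any specific framework): the handler extracts everything it needs from the self-contained event and persists its own derived state, so the broker's copy of the event can expire without loss.

```python
def handle_order_placed(event: dict, store: dict) -> None:
    """Handle a self-contained, hypothetical 'OrderPlaced' event."""
    # The event carries everything this service needs; no lookup
    # of older events is required.
    order_id = event["order_id"]
    store[order_id] = {
        "status": "reserved",
        "items": event["items"],
    }
    # Once the derived state is persisted in the service's own store,
    # the broker may discard the event; limited retention is harmless.

store = {}  # stand-in for a service-specific database
handle_order_placed({"order_id": "42", "items": ["book"]}, store)
print(store["42"]["status"])  # reserved
```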

If you are using events to record the actions that lead to a change in the state of something, then you are doing event sourcing and things become different. With event sourcing, you replay the stream of events to determine the state at a certain point in time instead of storing that state itself. This implies that you want to be able to replay this stream of events at any time (when you start a new instance of a service, for example). In this case, events should be stored "forever".

One possible workaround is to create snapshots. A snapshot represents the state of an object at a certain point in time. New actions are still stored as a stream of events in the event store, while older events can be discarded. Before the retention period ends, you should update your snapshot to reflect the new current state and again discard the events that happened before the snapshot. If you need to replay your events, you start at the known snapshot and process all succeeding events to derive the new state of the object(s). One important note: using snapshots has the side effect of losing the details that led to the state at a point in time.
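The snapshot-plus-replay idea can be sketched as follows (a toy aggregate and hypothetical event shapes, not a real framework's API): state is rebuilt from the latest snapshot plus only the events recorded after it, so anything before the snapshot may safely age out of the broker.

```python
def apply(state: int, event: dict) -> int:
    """Fold one event into the state of a toy account-balance aggregate."""
    if event["type"] == "Deposited":
        return state + event["amount"]
    if event["type"] == "Withdrawn":
        return state - event["amount"]
    return state

def replay(snapshot_state: int, events_after_snapshot: list) -> int:
    """Rebuild current state from a snapshot and the events after it."""
    state = snapshot_state
    for event in events_after_snapshot:
        state = apply(state, event)
    return state

# Snapshot taken before older events were discarded by retention:
snapshot = 100  # balance at snapshot time
recent_events = [
    {"type": "Deposited", "amount": 50},
    {"type": "Withdrawn", "amount": 30},
]
print(replay(snapshot, recent_events))  # 120
```

Note how the discarded pre-snapshot events are exactly the "lost details" mentioned above: the balance is correct, but the individual deposits that produced 100 are gone.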

Another solution is to use a different event store that allows you to specify the retention period...

Different frameworks exist for the event sourcing approach and "snapshotting". You should probably take a look at solutions like AxonIQ and Eventuate.

One could also use events to do stream processing. In this case you "just" want to record events at high speed, as in typical IoT solutions where you capture sensor data. After ingesting this data, you want to do calculations on the stream of events, hence the name stream processing. Whether you want to keep those events until the end of time depends on the use case of the project. If you are not allowed to change the retention period, you could store the raw events in a separate, dedicated store after ingesting them. This store then becomes the basis for further event processing. For such use cases, you could take a look at Apache Kafka.
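The "archive raw events right after ingestion" idea looks roughly like this (the broker and store here are in-memory stand-ins; a real setup might consume from Kafka and append to object storage or a database):

```python
def archive(stream, raw_store: list) -> None:
    """Copy every ingested event into a durable store before processing."""
    for event in stream:
        # Durable copy, independent of the broker's retention window.
        raw_store.append(event)

def average_temperature(raw_store: list) -> float:
    """Downstream processing works from the archived copy, not the broker."""
    readings = [e["temp"] for e in raw_store]
    return sum(readings) / len(readings)

raw_store = []
archive([{"temp": 20.0}, {"temp": 22.0}], raw_store)
print(average_temperature(raw_store))  # 21.0
```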

Hope this clears up some of your questions...

KDW
  • 485
  • 3
  • 17
  • I guess for commands, if your queue does not have infinite retention you would have to expect a response message or else retry sending the command? And for event sourcing, the service could host its own store and snapshot/replay events API, just using the queue as an async communication mechanism. In the latter, the consumer may have to periodically check the snapshot to detect missed messages? Or notice missing sequence numbers. – MattHH Sep 05 '21 at 03:19
  • 1
    Events and commands are 2 different things. A command expresses the intention to update the aggregate's state. A command handler verifies if the update is allowed given the aggregate's current state. If so, an event is dispatched and stored in the event store of the event sourcing framework/solution used. Because of this commands are not stored in the event store but are dispatched by the message broker. The broker's limited retention for handling command messages is fine since it only needs to take care to pass the command to the aggregate for further processing. For events, it is different. – KDW Sep 05 '21 at 20:14
  • Hosting the event store and snapshot/replay API inside the service seems odd unless you are building your own ES solution/framework. When using an "external" ES framework, it is the task of the framework to find the aggregate while processing commands and store/distribute events emitted by aggregates to all interested event handlers. The framework hosts the event store which is the source of truth for all other services. Services host and keep their own read stores up to date typically for service specific queries based on received events. One seldom queries the event store directly. – KDW Sep 05 '21 at 20:24
  • It is also up to the ES framework to use a snapshot policy to keep snapshots up to date and process missed events when the policy dictates it. Your consumer should not be bothered by this task unless again you build your own ES framework. Then you are absolutely right. – KDW Sep 05 '21 at 20:28
  • I feel like the question here is: Is there an existing ES framework that satisfies the need for replay and brokering. Or in the case where the historic event persistence is limited, have the framework handle snapshotting. – Lee McDonald Sep 13 '21 at 19:37
1

There are event streaming solutions that offer infinite retention, although the only one I know of is Kafka (using certain configurations / solutions $$).

To say that "event streams must be able to retain events for an infinite period of time" is rather arbitrary. There's no universal rule that says all event streaming problems require infinite retention. Also, you have to ask yourself: if that definition is true, why do platforms like Azure not offer it?

Your question's not a bad one, but I wouldn't stress about it.

What are techniques for coping with retention limitations of the event broker technology that is used?

Persist the events in another system that can return them, though not necessarily as "events". E.g. store them in a database (or a database-backed microservice) so that the historical event data is available via query.

Basically you'd have two components doing a similar job but targeting very different non-functional requirements: a system optimized for providing current data fast, and another optimized for long-term retention. Not unlike AWS's Glacier, which aims to provide low-cost retention by trading off speed.
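A sketch of the "cold" side of that split, using an in-memory SQLite database as a stand-in for the long-term store (the table and event names are hypothetical): every event flowing through the broker is also written to the archive, and history comes back via query rather than as re-delivered events.

```python
import sqlite3

# In-memory SQLite stands in for a durable archive database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (seq INTEGER PRIMARY KEY, type TEXT, payload TEXT)"
)

def ingest(event_type: str, payload: str) -> None:
    """Write each event to the archive as it passes through the broker,
    so it remains queryable after the broker's retention window expires."""
    conn.execute(
        "INSERT INTO events (type, payload) VALUES (?, ?)",
        (event_type, payload),
    )

ingest("OrderPlaced", '{"order_id": 1}')
ingest("OrderShipped", '{"order_id": 1}')

# Historical data is retrieved via query, not replayed from the broker:
rows = conn.execute("SELECT type FROM events ORDER BY seq").fetchall()
print([r[0] for r in rows])  # ['OrderPlaced', 'OrderShipped']
```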

Note though that it's probably a "roll-your-own" approach unless you use something like what MattHH mentioned.

Adrian K
  • 9,880
  • 3
  • 33
  • 59