
The thing I can't get my head around is this:
The CQRS model tells me that the write and read models are independent of each other.
The aggregate only writes to an events table, and publishes events to notify projectors.
The projector only consumes these events and updates the read tables.
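
Roughly what I mean by that split, as a minimal sketch (all names here are illustrative, not from any particular framework):

```python
# Write side: the aggregate appends to an events table and publishes to a bus.
class Aggregate:
    def __init__(self, event_store, bus):
        self.event_store = event_store  # wraps the "events" table
        self.bus = bus                  # message bus the projectors subscribe to

    def handle(self, command):
        event = self.decide(command)    # domain logic produces an event
        self.event_store.append(event)  # durable write model
        self.bus.publish(event)         # notify projectors

    def decide(self, command):
        ...

# Read side: a projector only sees published events and maintains read tables.
class Projector:
    def __init__(self, read_db):
        self.read_db = read_db

    def on_event(self, event):
        self.read_db.apply(event)  # idempotent update of the read model
```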

Scenario:
A new read model is introduced into the system, or an existing model needs to be rebuilt.
How can this be done if aggregate/projector responsibilities are separated?

  1. Should the projector read the events table? This requires it to "know" about that table/database, which isn't always feasible.
  2. Should the projector "ask" the aggregate to replay (re-publish) the events? That means the aggregate needs to implement some kind of "replay" method, which couples the projector and the aggregate. It also means that other projectors will receive these events again, and although their handlers are idempotent, this seems like unnecessary load on the message bus.

So, what's the right way to implement a replay?

user1102018

2 Answers


Should the projector read the events table?

Yes. Projection processes pull event histories out of durable storage (think "queues").

Transient event sharing is really only suitable for cases where you don't care about history (we're processing each event in isolation, it's not critical to the business if we miss one, etc.).
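
For concreteness, a rebuild might look something like this sketch, assuming the event store exposes ordered reads by a monotonically increasing sequence number (all names here are illustrative):

```python
def rebuild(projection, event_store, batch_size=500):
    """Replay the full history into a (re)built projection by pulling
    events from durable storage in sequence order."""
    offset = projection.load_checkpoint()  # 0 for a brand-new read model
    while True:
        batch = event_store.read(after=offset, limit=batch_size)
        if not batch:
            break  # caught up with the head of the log
        for event in batch:
            projection.apply(event)         # must be idempotent
            offset = event.sequence_nr
        projection.save_checkpoint(offset)
```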

Greg Young's Polyglot Data video might be a good starting point for a review (in particular, the section beginning at the 25m mark).

VoiceOfUnreason
  • Thanks for the reply. My understanding is that projectors consume events that the aggregate publishes to a message bus. A projector may not have access to the events _table_. So in order to reconstruct the read model, it needs the aggregate to publish the events again. – user1102018 Aug 18 '22 at 12:47

Elaborating on VoiceOfUnreason's answer, if there's a chance you'll ever want to replay all events (and there almost certainly is, considering that all manner of operational snafus can result in a need to replay all events after some point in time), that implies that projection processes will need access to some durable log of the events.

The journal table serving as the write model is an example of a durable log of the events (in this case represented as a table).

Another possibility could be a message bus that events are published to, if that message bus happens to meet the durability requirements. Kafka is potentially close enough (though it has some limitations to be aware of; Pulsar doesn't have the same limitations as Kafka in this area). Things that are more MQ-like probably won't be usable for this.

Any stream of the events is itself a projection of the events emitted by the aggregate when processing commands: a projection to a durable log of the events is thus perhaps the simplest possible projection. Accordingly, if projectors might not have access to the journal table (or the journal table might aggressively purge events after snapshotting) and the message bus is not durable, then one option is to make the first projector one which consumes events from the message bus and writes them to a durable log for the other projectors to consume. If this projector is the only one which acks to the message bus, it can ensure that all events eventually make it to the durable log. This "ur-projector" is likely to be so simple that it never needs to evolve.
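
A minimal sketch of such an ur-projector, assuming a bus whose messages are only acknowledged explicitly (the interfaces here are hypothetical):

```python
class UrProjector:
    """Consumes events from the (non-durable) bus and appends them to a
    durable log; it is the only consumer that acks, so an event is only
    acknowledged once it has been persisted."""

    def __init__(self, durable_log):
        self.durable_log = durable_log  # e.g. an append-only table or object store

    def on_message(self, message):
        self.durable_log.append(message.event)  # may deduplicate by event id
        message.ack()  # ack only after the durable write succeeds

# All other projectors then read from durable_log instead of the bus,
# which makes rebuilding a read model a matter of re-reading the log from offset 0.
```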

You can take this a step further and have a hierarchy of projectors, each of which takes the durable log published by another projector and builds a further durable log containing the same events. For instance, you might have a retention policy guaranteeing only the most recent 90 days of events in a particular Kafka cluster, while events from the beginning of time live in object storage like S3; that can be accomplished with a projector from Kafka to S3.
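
A sketch of that Kafka-to-S3 projector, using kafka-python and boto3 (the topic, bucket, batch size, and key layout are assumptions; a production version would flush partial batches and handle partition rebalances):

```python
import boto3
from kafka import KafkaConsumer

s3 = boto3.client("s3")
consumer = KafkaConsumer(
    "events",                      # topic with the 90-day retention
    bootstrap_servers="kafka:9092",
    group_id="s3-archiver",
    enable_auto_commit=False,      # commit only after the S3 write succeeds
    auto_offset_reset="earliest",
)

batch = []
for record in consumer:
    batch.append(record)
    if len(batch) >= 1000:
        # One object per batch, keyed by partition and first offset so the
        # archive can later be read back in order.
        key = f"events/{batch[0].partition}/{batch[0].offset:020d}.jsonl"
        body = b"\n".join(r.value for r in batch)
        s3.put_object(Bucket="event-archive", Key=key, Body=body)
        consumer.commit()          # Kafka offsets advance only after archiving
        batch = []
```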

Levi Ramsey