
In a CQRS event store, does an "aggregate" contain a summarized view of the events or simply a reference to the boundary of those events? (group id)

A projection is a view or representation of events, so an aggregate that represents a boundary would make sense to me. If the aggregate instead contained the current summarized state, I'd be confused about the duplication between the two.

Ruben Bartelink
webish
  • An aggregate raises zero or more events after it executes its commands. The store contains all the events raised by all the aggregates from the beginning. The events can be grouped logically by aggregate id into a stream. Please clarify your question. – Constantin Galbenu Jan 30 '17 at 13:40
  • I'm trying to figure out if an "aggregate", with respect to CQRS and event sourcing as often described by Greg Young, is a summarized state of the events making up an entity. For example, if the events making up an Order/Aggregate are summarized to a single row w/data (order total, etc) or there is simply an aggregate id to definitively link the events that could be summarized and represent current state of an Order. It seems like the latter since reads are performed from a projected store that maintains state but I'm unsure if I'm missing something. – webish Jan 30 '17 at 15:13
  • Events are used in two places: 1. the command side, by the aggregate, to build up its state and use that state to process future commands, and 2. the query side, by the read models, to build a state that is shown to the user within the UI. There could be several read models that process the same events, depending on what the user needs to see on the screen. – Constantin Galbenu Jan 30 '17 at 15:53
  • Ok, then that goes against what my intuition was telling me. So it's possible that the same data structure might exist in the aggregate and query side store (duplicated view used for different purposes). Does the aggregate persist a serialized version like events or a normalized set? – webish Jan 30 '17 at 15:55
  • I recommend this site if you want to understand more: http://cqrs.nu – Constantin Galbenu Jan 30 '17 at 15:57
  • There is no single data structure. The events are persisted in the event store, as a serialized string, for example. Then the aggregate's state (the Aggregate is an object) is built from these events, and that state is used only by the Aggregate; it can't be queried. On the other hand, every read model has a distinct data structure, constructed and optimized (denormalized) for use by the UI. – Constantin Galbenu Jan 30 '17 at 16:02
  • Thanks @ConstantinGALBENU, so it seems that the aggregate is computed using the entire set of events unless it gets large enough to warrant a snapshot. I'm gaining some clarity, though the use case for an aggregate at all seems odd. I'm having trouble identifying the value. I'll take a closer look through cqrs.nu. – webish Jan 30 '17 at 16:11
  • @ConstantinGALBENU cqrs.nu, whilst being a useful site, contains many dogmatic statements that have no real ground. So I would say: yes, it is useful, but challenge what they write there in your head. – Alexey Zimarev Jan 31 '17 at 21:13
  • Can you give me an example of what should be challenged? – Constantin Galbenu Feb 01 '17 at 07:31

1 Answer


In a CQRS event store, does an "aggregate" contain a summarized view of the events or simply a reference to the boundary of those events? (group id)

Aggregates don't exist in the event store

  • Events live in the event store
  • Aggregates live in the write model (the C of CQRS)

Aggregate, in this case, still has the same basic meaning that it had in the "Blue Book"; it's the term for a boundary around one or more entities that are immediately consistent with each other. The responsibility of the aggregate is to ensure that writes (commands) to the book of record respect the business invariant.
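As a rough illustration of that responsibility, here is a minimal Python sketch of a write-model aggregate. The `Account` class, its credit-limit invariant, and the `OrderPlaced` fields are all hypothetical, chosen only to show the shape: a command method checks the invariant against the current state and, if it holds, returns the new events.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    amount: int


@dataclass
class Account:
    """Hypothetical aggregate. Invariant: committed amount never exceeds the credit limit."""

    credit_limit: int
    committed: int = 0

    def apply(self, event: OrderPlaced) -> None:
        # State transition only. No validation: the event has already happened.
        self.committed += event.amount

    def place_order(self, order_id: str, amount: int) -> List[OrderPlaced]:
        # Command handler: reject the write if it would break the invariant.
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.committed + amount > self.credit_limit:
            raise ValueError("credit limit exceeded")
        event = OrderPlaced(order_id, amount)
        self.apply(event)
        return [event]  # the caller would append these to the event store
```

Note that nothing here is queried by the UI; the state exists only so that the next command can be accepted or rejected.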

It's typically convenient, in an event store, to organize the events into "streams"; if you imagine a RDBMS schema, the stream id will just be some identifier that says "these events are all part of the same history."
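To make the RDBMS picture concrete, here is a small sketch using Python's built-in `sqlite3` module. The table layout and the names `events`, `stream_id`, `version`, `append`, and `load_stream` are assumptions for illustration, not the schema of any particular event store product.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        stream_id TEXT    NOT NULL,  -- events sharing this id share one history
        version   INTEGER NOT NULL,  -- position of the event within its stream
        type      TEXT    NOT NULL,
        payload   TEXT    NOT NULL,  -- serialized event body
        PRIMARY KEY (stream_id, version)
    )
    """
)


def append(stream_id, version, event_type, data):
    # Store one event at a given position in a stream.
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?)",
        (stream_id, version, event_type, json.dumps(data)),
    )


def load_stream(stream_id):
    # Return one stream's events in the order they were recorded.
    rows = conn.execute(
        "SELECT type, payload FROM events WHERE stream_id = ? ORDER BY version",
        (stream_id,),
    )
    return [(event_type, json.loads(payload)) for event_type, payload in rows]


# Events from two streams interleaved in one table.
append("order-1", 1, "OrderPlaced", {"total": 40})
append("order-2", 1, "OrderPlaced", {"total": 15})
append("order-1", 2, "OrderCancelled", {})
```

The store holds no summarized row per aggregate here; the stream id is only the grouping key that lets one history be read back in order.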

It will usually be the case that one aggregate -> one stream, but usually isn't always; there are some exceptional cases you may need to handle when you change your model. Greg Young is covering some of these in his new eBook on event versioning.

So it's possible that the same data structure might exist in the aggregate and query side store (duplicated view used for different purposes).

Yes, and no. It's absolutely the case that the data structures used when validating a write can match those used to support a query. But the storage doesn't usually match. Put another way, aggregates don't get stored (the state of the aggregate does), whereas it is fairly common that the query view gets cached (again, not the data structure itself, but a representation that can be used to repopulate the data structure without necessarily needing to replay all of the events).

Any chance you have an example of aggregate state data structure (rdbms)? Every example I've found is trimmed down to a few columns, something like id, source_id, version, making it difficult to visualize what the scope of an aggregate is.

A common example would be that of a trading book (an aggregate responsible for matching "buy" and "sell" orders).

In a traditional RDBMS store, that would probably look like a row in a books table, with a unique id for the book, information about what item that book is tracking, date information about when that book is active, and so on. In addition, there's likely to be some sort of orders table, with unique ids, a trading book id, order type, transaction numbers, prices and volumes (in other words, all of the information the aggregate needs to know to satisfy its invariant).

In a document store, you'd see all of that information in a single document -- perhaps a json document with the information about the root object, and two lists of order objects (one for buys, one for sells).

In an event store, you'd see the individual events: OrderPlaced, TradeOccurred, OrderCancelled.
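The relationship between those representations can be sketched in Python. The event names come from the example above; every field (`side`, `price`, `volume`, and so on) and the `TradingBook` class are hypothetical, and the matching rules of a real trading book are left out entirely. The point is only that the document-shaped state, with its two lists of open orders, is derived from the event history.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    side: str  # "buy" or "sell"
    price: int
    volume: int


@dataclass(frozen=True)
class TradeOccurred:
    buy_order_id: str
    sell_order_id: str
    volume: int


@dataclass(frozen=True)
class OrderCancelled:
    order_id: str


@dataclass
class TradingBook:
    """Document-shaped state: a root id plus open buy and sell orders."""

    book_id: str
    buys: Dict[str, dict] = field(default_factory=dict)
    sells: Dict[str, dict] = field(default_factory=dict)

    def apply(self, event) -> None:
        if isinstance(event, OrderPlaced):
            side = self.buys if event.side == "buy" else self.sells
            side[event.order_id] = {"price": event.price, "volume": event.volume}
        elif isinstance(event, TradeOccurred):
            self._fill(self.buys, event.buy_order_id, event.volume)
            self._fill(self.sells, event.sell_order_id, event.volume)
        elif isinstance(event, OrderCancelled):
            self.buys.pop(event.order_id, None)
            self.sells.pop(event.order_id, None)

    @staticmethod
    def _fill(side, order_id, volume):
        # Reduce the open volume; drop the order once it is fully traded.
        side[order_id]["volume"] -= volume
        if side[order_id]["volume"] == 0:
            del side[order_id]


# What the event store holds: the history, not the current state.
history = [
    OrderPlaced("b1", "buy", price=101, volume=10),
    OrderPlaced("s1", "sell", price=100, volume=4),
    TradeOccurred("b1", "s1", volume=4),
    OrderPlaced("s2", "sell", price=105, volume=7),
    OrderCancelled("s2"),
]

book = TradingBook("book-1")
for event in history:
    book.apply(event)
```

After the replay, `book` holds one open buy order with the remaining volume and no open sell orders, which is roughly what the RDBMS rows or the JSON document would contain.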

it seems that the aggregate is computed using the entire set of events unless it gets large enough to warrant a snapshot.

Yes, that's exactly right. If you are familiar with a "fold function", then event sourcing is just a fold from some common initial state. When a snapshot is available, we'll fold from that state instead (with a corresponding reduction in the number of events that get folded in).

In an event sourced environment with "snapshots", you might see a combination of the event store and the document store (where the document would include additional meta information indicating where in the event stream it had been assembled).
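A minimal Python sketch of the fold, with and without a snapshot, using `functools.reduce`. The event kinds, the running-total state, and the snapshot fields `state` and `stream_version` are assumptions made up for the example.

```python
import json
from functools import reduce


def apply(total, event):
    # Pure step function: (state, event) -> new state.
    kind, amount = event
    if kind == "ItemAdded":
        return total + amount
    if kind == "ItemRemoved":
        return total - amount
    raise ValueError("unknown event kind: " + kind)


def replay(events, state=0):
    # Left fold of the events over a starting state (0 is the common initial state).
    return reduce(apply, events, state)


events = [("ItemAdded", 30), ("ItemAdded", 12), ("ItemRemoved", 5), ("ItemAdded", 8)]

# Full replay from the initial state.
full = replay(events)

# Snapshot document: the folded state plus meta information recording
# how far into the stream it was assembled.
snapshot_doc = json.dumps({"state": replay(events[:2]), "stream_version": 2})


def load(snapshot_json, all_events):
    # Start from the snapshot and fold in only the events recorded after it.
    doc = json.loads(snapshot_json)
    tail = all_events[doc["stream_version"]:]
    return replay(tail, doc["state"])


from_snapshot = load(snapshot_doc, events)
```

Both paths arrive at the same state; the snapshot only shortens the list of events that has to be folded.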

VoiceOfUnreason
  • Any chance you have an example of aggregate state data structure (rdbms)? Every example I've found is trimmed down to a few columns, something like id, source_id, version, making it difficult to visualize what the scope of an aggregate is. – webish Jan 30 '17 at 16:40
  • The data structure of the state of the Aggregate is unique for every aggregate class. If the Aggregate's state is persisted in a snapshot, as an optimization, it could be simply serialized as a string, using the default serializer of the programming language (in PHP for example, use the `serialize()` function) so the data structure is irrelevant from outside the aggregate, as you don't query it. – Constantin Galbenu Jan 31 '17 at 07:26
  • @VoiceOfUnreason Nice answer :) – Constantin Galbenu Jan 31 '17 at 07:26
  • In the end, an aggregate is "also" a projection (a left fold of the events, starting from a known initial state), just handled by the write model instead. Isn't it? For example: the aggregate is a product with a title. Say the marketing people are interested in knowing "how many times the title changed after creation". The events are ProductCreated and ProductTitleEdited. See the sequence in the next comment: – Xavi Montero Jun 24 '19 at 12:43
  • The "aggregate projector" starts by replaying the event stream. If an edit comes before creation, throw an exception (this should never have happened). Assuming creation comes first: on creation, left fold => `aggregate.title = event.title`; on edit, left fold => `aggregate.title = event.title`. In the "marketing projector", by contrast: on creation, left fold => `counter = 0`; on edit, left fold => `counter = counter + 1`. Isn't the aggregate also a projection? The only difference is that the party "interested in reading" is the write model, which uses the state to validate or reject further commands. But it's the same logic. – Xavi Montero Jun 24 '19 at 12:46
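The two folds described in the last comment can be sketched side by side in Python. The event names are the commenter's; the tuple encoding of events, the sample titles, and the function names are assumptions for illustration only.

```python
from functools import reduce

# One event stream, consumed by two different folds.
events = [
    ("ProductCreated", "Red mug"),
    ("ProductTitleEdited", "Crimson mug"),
    ("ProductTitleEdited", "Crimson coffee mug"),
]


def aggregate_step(state, event):
    # Write-model fold: keeps only what is needed to validate future commands.
    kind, title = event
    if kind == "ProductCreated":
        return {"title": title}
    if kind == "ProductTitleEdited":
        if state is None:
            raise ValueError("edit before creation should never happen")
        return {"title": title}
    raise ValueError("unknown event kind: " + kind)


def marketing_step(counter, event):
    # Read-model fold: counts title changes after creation.
    kind, _title = event
    if kind == "ProductCreated":
        return 0
    if kind == "ProductTitleEdited":
        return counter + 1
    raise ValueError("unknown event kind: " + kind)


aggregate_state = reduce(aggregate_step, events, None)
title_changes = reduce(marketing_step, events, 0)
```

Mechanically both are left folds over the same history; they differ in who consumes the result and what it is used for.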