Inter-aggregate references must use primary keys?

Question

While reading Microservice Patterns, I came across a paragraph saying that Domain-Driven Design requires aggregates to follow certain rules. One of the rules is "inter-aggregate references must use primary keys".

For example, it basically means that a class Book may only have getOwnerUserId() and shouldn't have getOwnerUser().

However, in Eric Evans's Domain-Driven Design, it clearly says:

Objects within the AGGREGATE can hold references to other AGGREGATE roots.

I guess it means that Book can have getOwnerUser().

If my understanding of these two books is correct, is "Microservice Patterns" wrong about aggregates? Or is it referring to some variant of Domain-Driven Design? Or did I miss something?

score 1 · Answer 1 · answered Jun 29 '21 at 17:18

Both books are saying roughly the same thing using different words. I'll add mine.

An aggregate can hold a reference to other aggregates in the same bounded context. This reference is through an identifier. In many cases an identifier is a primary key (relational artifact) or a document ID (e.g. from a document database like MongoDB). Regardless, in the domain, it's just an "identifier".
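As a minimal sketch of what a reference through an identifier can look like (Python is used for brevity; `UserId`, `User`, `Book`, and `InMemoryUserRepository` are hypothetical names, not taken from either book): `Book` stores only the owner's identifier, which plays the role of `getOwnerUserId()` from the question, and anything that needs the `User` itself resolves it through a repository.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class UserId:
    """Identity of a User aggregate; in the domain it is just a value."""
    value: str


@dataclass
class User:
    id: UserId
    name: str


@dataclass
class Book:
    book_id: str
    title: str
    owner_user_id: UserId  # reference to another aggregate by identity only


class InMemoryUserRepository:
    """Hypothetical repository; a real one would sit on top of a database."""

    def __init__(self) -> None:
        self._users: Dict[UserId, User] = {}

    def save(self, user: User) -> None:
        self._users[user.id] = user

    def find_by_id(self, user_id: UserId) -> Optional[User]:
        return self._users.get(user_id)


users = InMemoryUserRepository()
users.save(User(UserId("u-1"), "Alice"))
book = Book("b-1", "Domain-Driven Design", UserId("u-1"))

# The Book never holds the User; callers resolve the identifier when needed.
owner = users.find_by_id(book.owner_user_id)
```

Whether the identifier ends up as a primary key or a document ID is a persistence detail that `Book` does not need to know about.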

It is also possible for aggregates to refer to aggregates in another bounded context. In this case the reference is not just an identifier, but a projection of the "foreign" aggregate into the current bounded context.

Think of a library system. One bounded context could be the checkout system, and another could be about books themselves. A Library Patron aggregate could have references to books within its aggregate; these references would be small objects containing just a few of the books' properties: ID, title, and author perhaps, but not the number of pages, publisher, location in the library, etc.
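The library example could be sketched roughly as follows. This is an illustration under assumptions: `BookSummary`, `LibraryPatron`, and the `check_out` rule are hypothetical names and behavior chosen for the sketch, not part of any real system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class BookSummary:
    """Projection of a catalog book into the checkout context (assumed shape)."""
    book_id: str
    title: str
    author: str
    # No page count, publisher, or shelf location: checkout does not need them.


@dataclass
class LibraryPatron:
    patron_id: str
    checked_out: List[BookSummary] = field(default_factory=list)

    def check_out(self, book: BookSummary) -> None:
        # Invariant local to this aggregate: no duplicate checkouts.
        if any(b.book_id == book.book_id for b in self.checked_out):
            raise ValueError(f"{book.book_id} is already checked out")
        self.checked_out.append(book)


patron = LibraryPatron("p-42")
patron.check_out(BookSummary("b-1", "Domain-Driven Design", "Eric Evans"))
titles = [b.title for b in patron.checked_out]
```

Because `BookSummary` is an immutable copy, the patron aggregate cannot modify the book as it exists in the other bounded context.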

Levi Ramsey · Answer 2 · 2021-06-27T13:32:42.290

0

"Aggregate root" is essentially the DDD way of saying "primary key" (I suspect the reason for not saying "primary key" is that to do so would be bringing something that's more of an infrastructure concern into the domain).

If User is a separate aggregate from Book, Book can only hold a User's ID (assuming that that's the aggregate root for User), not a User.

Since anything outside of the User class can only access a user by ID, however, it's probably better naming to say getUser() vs. getUserId() and have getUser() return a user ID.

edited Jun 27 '21 at 13:32

answered Jun 26 '21 at 19:23

Levi Ramsey


Thanks for the answer. However, I don't agree that "aggregate root is the DDD way of saying primary key". In the book "Domain-Driven Design", lots of entities are described as "aggregate root". For example, a sentence in the book says "Cargo is also an obvious AGGREGATE root". It doesn't sound right to interpret it as "Cargo is also an obvious primary key" since "Cargo" is the whole class, not just the identity. – johnlinp Jun 27 '21 at 02:18

A whole class can be an aggregate root. But an aggregate root must be one member of an aggregate, see the definition of "aggregate" in the Evans book: "External references (to the aggregate) are restricted to one member of the aggregate, designated as the root" – Levi Ramsey Jun 27 '21 at 13:01
In the paragraph before "Cargo is also an obvious AGGREGATE root" there is "But at some point we'll probably want to drop the collection in favor of a database lookup with *Cargo* as the key". The subsequent Figure 7.3 also defines *Cargo* as a globally unique identifier. – Levi Ramsey Jun 27 '21 at 13:06
Let me paste the complete definition of aggregate in the DDD book here: "AGGREGATE, A cluster of associated objects that are treated as a unit for the purpose of data changes. External references are restricted to one member of the AGGREGATE, designated as the “root”, a set of consistency rules applies within the AGGREGATE’S boundaries." – johnlinp Jun 27 '21 at 14:15
Can we say that the term "aggregate root" sometimes refers to the whole class, and sometimes only to the primary key? – johnlinp Jun 27 '21 at 14:18
Since there's not really any such thing as a class (beyond being one of many possible implementation techniques) in DDD, I'm not sure that's really accurate. That said, in the interest of brevity, it's not uncommon to perhaps abuse terminology when implementing a DDD using classes to have a class named `Cargo` represent the aggregate. In that situation though, other aggregates would refer to an instance of `Cargo` by its identifier rather than holding a language-level reference/pointer to a `Cargo`. – Levi Ramsey Jun 27 '21 at 16:30
I see. Thanks for the explanation. However, I still have a question: you said that you suspect the reason for not saying "primary key" is that "to do so would be bringing something that's more of an infrastructure concern into the domain". But DDD has the concept of "identity". Why not just say "identity" instead? – johnlinp Jun 29 '21 at 05:51
My speculation would be that because there's the idea of global identity (for entities which can be aggregate roots) and local identity (within an aggregate, has no real meaning outside of the aggregate), and saying "identity of the aggregate root" is more verbose than just saying "aggregate root". – Levi Ramsey Jun 29 '21 at 14:05
"Aggregate root" is essentially the DDD way of saying "primary key". I strongly disagree, this statement makes no sense at all: an Aggregate Root is the root entity defining a transactional/consistency boundary while a "primary key" is just a value identifier in the context of a RDBMS. They are nothing alike. – plalx Jul 01 '21 at 01:34

plalx · Answer 3 · 2021-07-01T02:40:31.807

"inter-aggregate references must use primary keys"

"primary key" is very RDBMS-specific so identity would be more appropriate.

"Objects within the AGGREGATE can hold references to other AGGREGATE roots."

Can, but generally shouldn't.

Why reference through identity?

An Aggregate Root (AR) is a strong consistency boundary. The natural way for an AR to protect its invariants (including from violations through concurrency) is to encapsulate its data in a way that allows it to oversee/detect every change.

When you reference other ARs by object reference rather than identity the consistency boundary becomes blurry which makes the design much harder to reason about.

Here's a (rather silly) example:
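As a minimal sketch of the kind of design in question (only the names `InviteList` and `persons` come from the discussion; the class shapes and the `rename_everyone` method are assumptions made for illustration, written in Python for brevity):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:  # assumed to be an aggregate root of its own
    person_id: str
    name: str


@dataclass
class InviteList:
    list_id: str
    # Direct object references to other aggregate roots.
    persons: List[Person] = field(default_factory=list)

    def rename_everyone(self, new_name: str) -> None:
        # Mutates *other* aggregates from inside this one.
        for person in self.persons:
            person.name = new_name


alice = Person("p-1", "Alice")
invites = InviteList("l-1", [alice])
invites.rename_everyone("Foo")
# alice, an aggregate in its own right, was just changed through InviteList.
```

Had `InviteList` held a list of person identifiers instead, it would have had no way to touch a `Person` at all.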

We can see that it's no longer enough to look at the AR's structure to know what's truly part of its boundary, and that can easily lead to issues.

Furthermore, would you know if persons will get deleted if you delete InviteList or if changes made to persons from within InviteList would get persisted when calling save(inviteList)? You'd have to inspect the persistence mappings (assuming an ORM) and the cascade options to know for sure.

Why have direct references?

I'd say the primary reason to allow a direct reference to another AR would be to be pragmatic about queries that are constructed from domain objects. It's generally harder to query without such relationships (e.g. find all InviteList that have an invitee named "Foo") or construct DTOs that must aggregate data from multiple ARs (e.g. InviteListDto with all the invitee names).

However, that's also one of the many reasons CQRS has become so popular these days. If you bypass the domain model for queries entirely (e.g. plain SQL) then you do not have to make concessions in your domain for querying needs.
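To illustrate the read side, here is a sketch of the "find all InviteList that have an invitee named Foo" query done in plain SQL, with no domain objects involved. The table and column names are assumptions, and an in-memory SQLite database stands in for whatever store is actually used.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE invite_list (id TEXT PRIMARY KEY);
    CREATE TABLE invite_list_invitee (
        invite_list_id TEXT NOT NULL REFERENCES invite_list(id),
        person_id      TEXT NOT NULL REFERENCES person(id)
    );
    INSERT INTO person VALUES ('p-1', 'Foo'), ('p-2', 'Bar');
    INSERT INTO invite_list VALUES ('l-1'), ('l-2');
    INSERT INTO invite_list_invitee VALUES
        ('l-1', 'p-1'), ('l-1', 'p-2'), ('l-2', 'p-2');
    """
)


def find_invite_list_ids_by_invitee_name(name: str) -> List[str]:
    """Read-side query: joins the tables directly, bypassing the domain model."""
    rows = conn.execute(
        """
        SELECT DISTINCT il.id
        FROM invite_list il
        JOIN invite_list_invitee ili ON ili.invite_list_id = il.id
        JOIN person p ON p.id = ili.person_id
        WHERE p.name = ?
        ORDER BY il.id
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


foo_lists = find_invite_list_ids_by_invitee_name("Foo")
bar_lists = find_invite_list_ids_by_invitee_name("Bar")
```

The write side can then keep `InviteList` referencing invitees by identity only, since the join happens in the query rather than in the object graph.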

References

Here's a sample from the IDDD book by Vaughn Vernon where he talks about that very quote from Evans.
