I have an aggregate root named Account and an entity named Contact that can be accessed through a method on the root: Account.GetContactById(string id). Access to the aggregate root is through a repository, so data access logic to get Accounts from storage resides there.

Where should the data access logic for the Contact entity reside? Most examples I have seen show the Account.GetContactById method searching an in-memory collection. In my case, an Account can reference thousands of Contacts, which I would not want to prefetch into memory. So, given that the method will need to hit data storage when it is called, do I implement that access in:

  1. The Account.GetContactById method? That would spread direct access to storage outside of repositories and introduce some tight coupling.
  2. The AccountRepository, so it can be called by the Account aggregate? That would seem to expose Contact entities directly to any other user of the repository, which violates Evans' rules.
  3. Another repository, such as ContactRepository? In that case I have a repository for an entity that is not an aggregate root.
  4. Other?
BitMask777
  • Why is `Contact` not an aggregate root of its own? Why did you decide to go with a large cluster `Account` aggregate? – plalx May 14 '15 at 23:32
  • @plalx Excellent point. My dilemma might just be an artifact of a bad modeling decision. I'll think on this a bit more. – BitMask777 May 15 '15 at 20:09
  • Usually, if there are no invariants to enforce (such as a maximum number of contacts), then `Contact` would probably be an AR, since you would gain nothing by clustering them within `Account`. – plalx May 15 '15 at 20:25
  • @plalx Your advice led me down the right solution path. I'd like to give you some cred for this, so if you want to post it as an answer I'll accept it and add comments to it with my solution specifics. – BitMask777 May 18 '15 at 17:05

1 Answer

The comments from @plalx pointed me in the right direction. I'm posting my resolution here as the answer to help others who might have this same type of question.

After reading a couple of really good articles by Vernon about modeling aggregates (you can find the articles here and here), I came to the conclusion that I was letting compositional structure drive me toward a bad model. The fact that Contact relates to Account is not, by itself, enough to put them in the same aggregate. From Vernon:

designing aggregates is more about consistency boundaries. A reference between two aggregates does not mean they are in the same consistency boundary, and therefore a single aggregate with one as the root.

He explains how this doesn't scale, in part because of the very issue I was hitting, using a sprint tracking system as an example:

Keeping performance and scalability in mind, what happens when one user of one tenant wants to add a single backlog item to a product, one that is years old and already has thousands of backlog items? Assume a persistence mechanism capable of lazy loading (Hibernate). We almost never load all backlog items, releases, and sprints all at once. Still, thousands of backlog items would be loaded into memory just to add one new element to the already large collection.

He makes some helpful comments about how single-entity aggregates with value types are to be preferred when possible over the large-cluster, multi-entity aggregate model I was building, and about how separate aggregates should reference one another by identifier only. He suggests application services as a means of resolving associations between aggregates.

So, in the end, I separated Contact and Account into different aggregates, with an application service to resolve the association.
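
For anyone who wants to see the shape of this, here is a minimal sketch of the idea, not my actual code. It is written in Python for brevity, and everything in it is an assumption made for illustration: the in-memory repositories, the `AccountContactService` name, and the `get_contact_for_account` method are hypothetical stand-ins. The point is only that `Contact` holds an `account_id` rather than living inside an `Account` collection, and that the application service resolves the association.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Account:
    """Aggregate root. Note that it holds no collection of Contacts."""
    id: str
    name: str


@dataclass
class Contact:
    """Separate aggregate root that refers to its Account by identifier only."""
    id: str
    account_id: str
    name: str


class InMemoryAccountRepository:
    """Hypothetical repository; a real one would query storage."""

    def __init__(self) -> None:
        self._items: Dict[str, Account] = {}

    def add(self, account: Account) -> None:
        self._items[account.id] = account

    def get_by_id(self, account_id: str) -> Optional[Account]:
        return self._items.get(account_id)


class InMemoryContactRepository:
    """Hypothetical repository; loads one Contact at a time, never the whole set."""

    def __init__(self) -> None:
        self._items: Dict[str, Contact] = {}

    def add(self, contact: Contact) -> None:
        self._items[contact.id] = contact

    def get_by_id(self, contact_id: str) -> Optional[Contact]:
        return self._items.get(contact_id)


class AccountContactService:
    """Application service that resolves the Account-to-Contact association."""

    def __init__(self, accounts: InMemoryAccountRepository,
                 contacts: InMemoryContactRepository) -> None:
        self._accounts = accounts
        self._contacts = contacts

    def get_contact_for_account(self, account_id: str,
                                contact_id: str) -> Optional[Contact]:
        # Fail loudly if the account itself does not exist.
        if self._accounts.get_by_id(account_id) is None:
            raise LookupError(f"Unknown account: {account_id}")
        contact = self._contacts.get_by_id(contact_id)
        # Only return the contact if it actually belongs to this account.
        if contact is None or contact.account_id != account_id:
            return None
        return contact


accounts = InMemoryAccountRepository()
contacts = InMemoryContactRepository()
accounts.add(Account("a1", "Acme"))
contacts.add(Contact("c1", "a1", "Dana"))
contacts.add(Contact("c2", "a2", "Eli"))
service = AccountContactService(accounts, contacts)
```

With this arrangement, what used to be Account.GetContactById becomes a call on the service, and looking up one contact never requires loading the account's other contacts.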

BitMask777