DDD: do I really need to load all objects in an aggregate? (Performance concerns)

Question

In DDD, a repository loads an entire aggregate: we either load all of it or none of it. This also means that we should avoid lazy loading.

My concern is performance. What if this results in loading thousands of objects into memory? For example, an aggregate for Customer comes back with ten thousand Orders.

In cases like this, could it mean that I need to redesign and re-think my aggregates? Does DDD offer suggestions regarding this issue?

If your `Customer` aggregate contains ten thousand `Order` objects, chances are `Order` is an aggregate on its own. — theDmi, Jun 06 '16 at 16:38
[This answer of mine](http://stackoverflow.com/a/31585706/219187) summarizes the primary design drivers for aggregates. — theDmi, Jun 06 '16 at 16:43
As I understand the concept, the basic rule is that for a read operation you don't need to load the whole aggregate. For a write operation, you should load the whole aggregate for transaction, integrity, and validation purposes. — Muflix, Dec 30 '19 at 14:22
@Muflix read operations can sometimes contain calculation results that are generated by the aggregate, e.g. a Customer display that has Discount eligibility as a property. Yes, you can do the calculation outside the aggregate, but that defeats the purpose of DDD. — james, Apr 29 '21 at 01:44
@James If you need to display calculations that live in the aggregate, then yes, you have to load it. But not all read calculations need to be in the aggregate, and they probably should not be (this is the tricky part, and I am not sure). For complex queries I just run a stored procedure with an optimized SQL query, but you are right that if your read operation contains business logic, where to put it is a good question. — Muflix, May 05 '21 at 10:41

mgonzalezbaile · Accepted Answer · 2016-06-08T21:56:06.623

Take a look at the three-part Effective Aggregate Design series of articles by Vaughn Vernon. I found them quite useful for understanding when and how to design smaller aggregates rather than one large-cluster aggregate.

EDIT

I would like to give a couple of examples to improve my previous answer. Feel free to share your thoughts about them.

First, a quick definition of an Aggregate (taken from the book Patterns, Principles, and Practices of Domain-Driven Design by Scott Millett):

Entities and Value Objects collaborate to form complex relationships that meet invariants within the domain model. When dealing with large interconnected associations of objects, it is often difficult to ensure consistency and concurrency when performing actions against domain objects. Domain-Driven Design has the Aggregate pattern to ensure consistency and to define transactional concurrency boundaries for object graphs. Large models are split by invariants and grouped into aggregates of entities and value objects that are treated as conceptual whole.

Let's go with an example to see the definition in practice.

Simple Example

The first example shows how defining an Aggregate Root helps to ensure consistency when performing actions against domain objects.

Given the following business rule:

Winning auction bids must always be placed before the auction ends. If a winning bid is placed after an auction ends, the domain is in an invalid state because an invariant has been broken and the model has failed to correctly apply domain rules.

Here there is an aggregate consisting of Auction and Bids where the Auction is the Aggregate Root.

If we made Bid a separate Aggregate Root as well, you would have a BidsRepository, and you could easily do:

var newBid = new Bid(money);
bidsRepository.save(auctionId, newBid);

You would then be saving a Bid without checking the business rule. However, with Auction as the only Aggregate Root, the design enforces the rule, because you need to do something like:

var newBid = new Bid(money);
auction.placeBid(newBid);
auctionRepository.save(auction);

Therefore, you can check your invariant within the method placeBid and nobody can skip it if they want to place a new Bid.

Here it is pretty clear that the state of a Bid depends on the state of an Auction.
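
As a minimal sketch of what placeBid might check, here is a small Python version of the Auction root. The `placed_at` field, the `AuctionEnded` exception, and the `ends_at` attribute are assumptions made for illustration; the snippets above only pass an amount.

```python
from dataclasses import dataclass, field
from datetime import datetime
from decimal import Decimal


class AuctionEnded(Exception):
    """Raised when a bid arrives after the auction has closed (hypothetical name)."""


@dataclass(frozen=True)
class Bid:
    amount: Decimal
    placed_at: datetime  # assumed field, needed to compare against the end time


@dataclass
class Auction:
    auction_id: str
    ends_at: datetime
    bids: list = field(default_factory=list)

    def place_bid(self, bid: Bid) -> None:
        # The invariant lives in the root: a bid must be placed before the end.
        if bid.placed_at >= self.ends_at:
            raise AuctionEnded(f"auction {self.auction_id} ended at {self.ends_at}")
        self.bids.append(bid)


auction = Auction("a-1", ends_at=datetime(2016, 6, 8, 12, 0))
auction.place_bid(Bid(Decimal("10.00"), placed_at=datetime(2016, 6, 8, 11, 59)))

try:
    auction.place_bid(Bid(Decimal("20.00"), placed_at=datetime(2016, 6, 8, 12, 1)))
    late_bid_rejected = False
except AuctionEnded:
    late_bid_rejected = True
```

Because bids can only be added through `place_bid`, there is no code path that stores a late bid.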

Complex Example

Back to your example of Orders associated with a Customer: there do not seem to be any invariants that force us to define a huge aggregate consisting of a Customer and all her Orders, so we can keep the relation between the two entities through an identifier reference. By doing this, we avoid loading all the Orders when fetching a Customer, and we also mitigate concurrency problems.
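
A rough Python sketch of the identifier reference could look like this. The in-memory repository, the `find_by_customer` method, and the `limit` parameter are hypothetical and only stand in for whatever persistence you actually use.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    name: str
    # No list of Order objects here: orders form a separate aggregate.


@dataclass
class Order:
    order_id: str
    customer_id: str  # reference to the Customer aggregate by identity only
    total_cents: int


class InMemoryOrderRepository:
    """Hypothetical repository; a real one would query a database."""

    def __init__(self):
        self._orders = {}

    def save(self, order: Order) -> None:
        self._orders[order.order_id] = order

    def find_by_customer(self, customer_id: str, limit: int = 20) -> list:
        # Orders are fetched on demand, and only as many as the caller asks for.
        matches = [o for o in self._orders.values() if o.customer_id == customer_id]
        return matches[:limit]


orders = InMemoryOrderRepository()
for i in range(100):
    orders.save(Order(f"o-{i}", "c-1", total_cents=1000 + i))

customer = Customer("c-1", "Alice")  # creating or loading this touches no orders
recent = orders.find_by_customer("c-1", limit=5)
```

Fetching the Customer never pulls in her orders; code that needs some of them asks the order repository explicitly.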

But say the business now defines the following invariant:

We want to provide Customers with a pocket so they can charge it with money to buy products. Therefore, if a Customer now wants to buy a product, it needs to have enough money to do it.

That said, the pocket is a VO (Value Object) inside the Customer Aggregate Root. It now seems that having two separate Aggregate Roots, one for Customer and another for Order, is not the best way to satisfy the new invariant, because we could save a new order without checking the rule. It looks like we are forced to make Customer the root, with its Orders inside the aggregate, which is going to hurt performance and scalability and cause concurrency issues.

Solution? Eventual consistency. What if we allow the customer to buy the product anyway? That is, we keep an Aggregate Root for Orders, so we create the order and save it:

var newOrder = new Order(customerId, ...);
orderRepository.save(newOrder);

We publish an event when the order is created, and then we check asynchronously whether the customer has enough funds:

class OrderWasCreatedListener {
    void handle(OrderWasCreated event) {
        var customer = customerRepository.findOfId(event.customerId);
        var order = orderRepository.findOfId(event.orderId);
        customer.placeOrder(order); // Check business rules
        customerRepository.save(customer);
    }
}

If everything went well, we have satisfied our invariants while keeping the design we wanted at the beginning, modifying just one Aggregate Root per request. Otherwise, we send an email to the customer telling her about the insufficient funds. We can take advantage of this by adding to the email alternative options she can purchase with her current budget, as well as encouraging her to charge the pocket.

Keep in mind that the UI can help us avoid customers paying without enough money, but we cannot blindly trust the UI.
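
To make the whole flow concrete, here is a small Python sketch under some assumptions: plain dicts stand in for the two repositories, the `notify` callback stands in for sending the email, the event is delivered by a direct call rather than through a real queue, and names such as `pocket_cents` and `InsufficientFunds` are invented for the example.

```python
from dataclasses import dataclass


class InsufficientFunds(Exception):
    """Hypothetical exception for a pocket that cannot cover an order."""


@dataclass
class Order:
    order_id: str
    customer_id: str
    total_cents: int


@dataclass
class Customer:
    customer_id: str
    pocket_cents: int

    def place_order(self, order: Order) -> None:
        # Business rule: the pocket must cover the order total.
        if order.total_cents > self.pocket_cents:
            raise InsufficientFunds(order.order_id)
        self.pocket_cents -= order.total_cents


@dataclass(frozen=True)
class OrderWasCreated:
    order_id: str
    customer_id: str


class OrderWasCreatedListener:
    def __init__(self, customers: dict, orders: dict, notify):
        self.customers = customers  # stand-in for customerRepository
        self.orders = orders        # stand-in for orderRepository
        self.notify = notify        # stand-in for the email sender

    def handle(self, event: OrderWasCreated) -> None:
        customer = self.customers[event.customer_id]
        order = self.orders[event.order_id]
        try:
            # Only the Customer aggregate is modified in this step.
            customer.place_order(order)
        except InsufficientFunds:
            self.notify(customer.customer_id, "insufficient funds")


customers = {"c-1": Customer("c-1", pocket_cents=5000)}
orders = {}
sent = []
listener = OrderWasCreatedListener(
    customers, orders, lambda customer_id, message: sent.append((customer_id, message))
)

# The order is created and saved on its own; the check happens when the event arrives.
orders["o-1"] = Order("o-1", "c-1", total_cents=3000)
listener.handle(OrderWasCreated("o-1", "c-1"))

orders["o-2"] = Order("o-2", "c-1", total_cents=3000)
listener.handle(OrderWasCreated("o-2", "c-1"))
```

The first order fits the pocket and is deducted; the second does not, so the pocket is left untouched and the customer is notified instead.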

Hope you find both examples useful, and let me know if you find better solutions for the scenarios described :-)

Thank you for your effort in explaining this to me. I was compelled to award you the answer. Thanks! — user11081980, Jun 15 '16 at 17:50
This is a good example. I currently have an aggregate that has 2 small children that require a lot of validation but are not collections, so it's a no-brainer. However, as you scale out you have these tricky design choices where you must decide whether you want to use domain events (slightly less obvious, bit harder to test) or tack on another child (bloat the aggregate more, slow down your writes). — perustaja, Jun 02 '20 at 21:00

score 23 · Answer 2 · answered Jun 06 '16 at 17:44

In cases like this, could it mean that I need to redesign and re-think my aggregates?

Almost certainly.

The driver for aggregate design isn't structure, but behavior. We don't care that "a user has thousands of orders". What we care about is which pieces of state need to be checked when you try to process a change: what data do you need to load to know whether a change is valid?

Typically, you'll come to realize that changing an order doesn't (or shouldn't) depend on the state of other orders in the system, which is a good indication that two different orders should not be part of the same aggregate.

VoiceOfUnreason

How does your answer change if there's an invariant saying that a Customer can't have more than `n` orders in their lifetime? Orders still don't depend on each other, but you must load them all (or their references?) in order to know how many orders the customer has... – ishegg May 17 '19 at 22:44
@ishegg, I see two solutions here: Domain Services, where one service takes care of instantiating the new order with that logic, and Domain Events, as mgonzalezbaile explains – javier_domenech Jun 09 '19 at 13:05
@javier_domenech first solution - domain service - when orders are aggregate roots themselves, you can't get away without having an index in the database – tchelidze Nov 12 '20 at 10:30
@VoiceOfUnreason There might be different behaviors associated with an aggregate, each of them requiring different data. – Ced Jan 24 '23 at 20:02
