How does persistence ignorance work with references to (non-root) aggregates?

Question

We have several aggregate roots that have two primary means of identification:

- an integer "key", which serves as the primary key in the database (and as the foreign key in referencing aggregates) and is used internally within the application, but is not exposed through the public web API.
- a string-based "id", which also uniquely identifies the aggregate root and is exposed through the public web API.

There are several reasons for having an integer-based private identifier and a string-based public identifier - for example, the database performs better (8-byte integers as opposed to variable-length strings) and the public identifiers are difficult to guess.

However, the classes internally reference each other using the integer-based identifiers and if an integer-based identifier is 0, this signifies that the object hasn't yet been stored to the database. This creates a problem, in that entities are not able to reference other aggregate roots until after they have been saved.

How does one get around this problem, or is there a flaw in my understanding of persistence ignorance?

EDIT regarding string-based identifiers

The repository, which is connected to a PostgreSQL database, generates the string-based identifiers so that they do not clash with anything currently in the database. For example:

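class Customer {
    // Database primary key; 0 means the customer has not been saved yet.
    public $customerKey;
    // Public identifier exposed through the web API.
    public $customerId;
    public $name;

    public function __construct($customerKey, $customerId, $name) {
        $this->customerKey = $customerKey;
        $this->customerId = $customerId;
        $this->name = $name;
    }
}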

function test(Repository $repository, UnitOfWork $unitOfWork) {
    $customer = new Customer(0, $repository->generateCustomerId(), "John Doe");
    // $customer->customerKey == 0
    $unitOfWork->saveCustomer($customer);
    // $customer->customerKey != 0
}

I assume that the same concept could be used to create an entity with an integer-based key of non-0, and the Unit of Work could use the fact that it doesn't exist in the database as a reason to INSERT rather than UPDATE. The test() function above would then become:

function test(Repository $repository, UnitOfWork $unitOfWork) {
    $customer = new Customer($repository->generateCustomerKey(), $repository->generateCustomerId(), "John Doe");
    // $customer->customerKey != 0
    $unitOfWork->saveCustomer($customer);
    // $customer->customerKey still != 0
}

However, given the above, errors may occur if the Unit of Work does not save the database objects in the correct order. Is the way to get around this to ensure that the Unit of Work saves entities in the correct order?
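
To make the ordering question concrete, here is a minimal sketch of a unit of work that sorts pending entities so that each one is saved after the entities it references. The class `OrderedUnitOfWork`, its methods, and the `'customer:7'` style names are all hypothetical and only illustrate the idea; this is not the API of any particular library.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: orders pending saves so that every entity comes
// after the entities it references (a depth-first topological sort).
final class OrderedUnitOfWork
{
    /** @var array<string, array<int, string>> entity name => names it references */
    private array $pending = [];

    public function register(string $name, array $dependsOn = []): void
    {
        $this->pending[$name] = $dependsOn;
    }

    /** @return array<int, string> entity names in a safe save order */
    public function saveOrder(): array
    {
        $order = [];
        $state = []; // name => 'visiting' | 'done'
        foreach (array_keys($this->pending) as $name) {
            $this->visit($name, $state, $order);
        }
        return $order;
    }

    private function visit(string $name, array &$state, array &$order): void
    {
        if (!isset($this->pending[$name])) {
            return; // not pending: assumed to be in the database already
        }
        $current = $state[$name] ?? null;
        if ($current === 'done') {
            return;
        }
        if ($current === 'visiting') {
            throw new LogicException("Circular reference involving $name");
        }
        $state[$name] = 'visiting';
        foreach ($this->pending[$name] as $dependency) {
            $this->visit($dependency, $state, $order);
        }
        $state[$name] = 'done';
        $order[] = $name;
    }
}

$unitOfWork = new OrderedUnitOfWork();
$unitOfWork->register('order:1', ['customer:7']); // the order references the customer
$unitOfWork->register('customer:7');
$order = $unitOfWork->saveOrder(); // the customer is listed before the order
```

A circular reference between two pending entities cannot be ordered this way, which is why the sketch throws in that case instead of guessing.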

I hope the above edit clarifies my situation.

"in that entities are not able to reference other aggregate roots until after they have been saved" - I could interpret that a couple of different ways. Is it possible you are violating the boundaries of your aggregate roots? Also, why not model your relationships as references - then your object graph is valid regardless of whether or not id is saved. See also: https://lostechies.com/jimmybogard/2008/05/21/entities-value-objects-aggregates-and-roots/ — Nathan, Sep 07 '15 at 07:02
By "why not model your relationships as references", do you mean (C-style pointer-type) references? E.g., rather than storing the integer-based key, storing a pointer/reference to the object? If that's the case, then loading a single aggregate would mean loading a significant portion of the "object graph", which is not practical for performance reasons and blurs the lines of ownership. — magnus, Sep 07 '15 at 22:54

score 2 · Answer 1 · answered Sep 07 '15 at 09:58

It's a good approach to look at Aggregates as consistency boundaries. In other words, two different aggregates have separate lifecycles and you should refrain from tying their fates together inside the same transaction. From that axiom you can safely state that no aggregate A will ever have an ID of 0 when looked at from another aggregate B's perspective, because either the transaction that creates A has not finished yet and A is not visible to B, or it has completed and A has an ID.
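
One way to encode that rule in the model is a small value object that can only hold the key of an aggregate that has already been saved. This is a sketch assuming PHP 8.1 or later; `CustomerKey` and `Invoice` are hypothetical names chosen for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical value object: a reference to an already persisted Customer.
final class CustomerKey
{
    public function __construct(public readonly int $value)
    {
        if ($value <= 0) {
            // 0 is the "not saved yet" marker, so it is never a valid reference.
            throw new InvalidArgumentException('Customer has not been saved yet.');
        }
    }
}

// Hypothetical aggregate that refers to a Customer by key only,
// never by holding the Customer object itself.
final class Invoice
{
    public function __construct(public readonly CustomerKey $customerKey)
    {
    }
}

$invoice = new Invoice(new CustomerKey(42));
```

With this in place, an aggregate that references a customer simply cannot be constructed before that customer's own transaction has produced a key.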

Regarding the double identity, I'd rather have the string ID generated by the language than by the database, because I suppose coming up with a unique ID in the database would imply a transaction, possibly across multiple tables. Languages can usually generate unique strings with good entropy.
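
As a rough illustration in PHP, a public identifier can be built from the language's cryptographically secure random source. The free function `generateCustomerId()` below is a hypothetical stand-in for the repository method in the question, and the 16-byte length is an arbitrary choice.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: builds a public identifier from 16 random bytes
// (128 bits) drawn from PHP's cryptographically secure generator.
function generateCustomerId(): string
{
    return bin2hex(random_bytes(16)); // 32 lowercase hexadecimal characters
}

$customerId = generateCustomerId();
```

Collisions are extremely unlikely at this length, but a unique constraint on the column is still a cheap safety net.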

guillaume31

This answer makes a lot of sense according to how I understand DDD, but the current job I'm working on involves the bulk creation and update of a significant number of aggregates and relationships based on a large data file. There are two phases - validation and submission. The validation rules require relationships to be in place, but if the data is invalid, the update will not proceed. Because aggregate relationships depend on non-0 keys, and non-0 keys depend on database persistence, I don't know how I can "refrain from tying their fates together inside the same transaction". – magnus Sep 07 '15 at 22:44
Wouldn't intermediary `Save()` calls along the course of your UoW solve the problem? – guillaume31 Sep 08 '15 at 09:42
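
One reading of this suggestion is sketched below: save each aggregate inside a single database transaction so that it receives its key immediately, then roll everything back if validation fails. This assumes the PDO SQLite driver is available, with an in-memory SQLite database standing in for PostgreSQL; the table layout, `importBatch()`, and the `$isValid` flag are hypothetical.

```php
<?php
declare(strict_types=1);

// Assumption: SQLite in memory stands in for the real PostgreSQL database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE customer (customer_key INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE invoice (invoice_key INTEGER PRIMARY KEY, customer_key INTEGER NOT NULL)');

// Hypothetical import step; $isValid stands for the outcome of the validation phase.
function importBatch(PDO $pdo, string $customerName, bool $isValid): bool
{
    $pdo->beginTransaction();
    try {
        // Intermediary save: the customer gets its key before the batch ends.
        $insertCustomer = $pdo->prepare('INSERT INTO customer (name) VALUES (?)');
        $insertCustomer->execute([$customerName]);
        $customerKey = (int) $pdo->lastInsertId();

        // The relationship can now be built with a non-0 key.
        $insertInvoice = $pdo->prepare('INSERT INTO invoice (customer_key) VALUES (?)');
        $insertInvoice->execute([$customerKey]);

        if (!$isValid) {
            $pdo->rollBack(); // validation failed: nothing is kept
            return false;
        }
        $pdo->commit();
        return true;
    } catch (Throwable $e) {
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    }
}

$rejected = importBatch($pdo, 'John Doe', false);
$rowsAfterReject = (int) $pdo->query('SELECT COUNT(*) FROM customer')->fetchColumn();
$accepted = importBatch($pdo, 'Jane Doe', true);
$rowsAfterAccept = (int) $pdo->query('SELECT COUNT(*) FROM customer')->fetchColumn();
```

Note that this deliberately puts several aggregates in one transaction, so it trades the consistency-boundary guideline from the answer for the all-or-nothing behaviour the bulk import needs.
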
@user1420752 If you explained the business problem rather than constraining yourself to a specific technical approach perhaps we could help you better. – plalx Sep 08 '15 at 13:58
