
I've run into something really weird with aggregate handling in Spring Data JDBC (using Spring Boot 2.1 with the necessary starters). Let me explain the case (I'm using Lombok; the issue might be related to it, though)...

This is an excerpt from my entity:

import java.util.Set;

import lombok.Data;
import org.springframework.data.annotation.Id;

@Data
public class Person {
    @Id
    private Long id;
    ...
    private Set<Address> address;
}

This is an associated Spring Data repository:

import org.springframework.data.repository.CrudRepository;

public interface PersonsRepository extends CrudRepository<Person, Long> {
}

And this is a test, which fails:

@Autowired
private PersonsRepository personDao;
...
Person person = personDao.findById(1L).get();
Assert.assertTrue(person.getAddress().isEmpty());
person.getAddress().add(myAddress); // builder made, whatever
person = personDao.save(person);
Assert.assertEquals(1, person.getAddress().size()); // count is... 2!

The fact is that, while debugging, I found out that the address collection (which is a Set) contains TWO references to the same instance of the attached address. I don't see how two references end up in there, and most importantly how a SET (actually a LinkedHashSet, for the record) can hold the same instance TWICE!

person  Person  (id=218)    
    address LinkedHashSet<E>  (id=228)  
        [0] Address  (id=206)   
        [1] Address  (id=206)   

Does anybody have a clue about this situation? Thanks!

Thomas Escolan

2 Answers


A (Linked)HashSet can (as a side effect) store the same instance twice when that instance has been mutated in the meantime (quote from the Set Javadoc):

Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set.

So here's what probably happens:

  1. You create a new instance of Address but its ID is not set (id=null).
  2. You add it to the Set, and its hash code is calculated as some value A.
  3. You call PersonsRepository.save which most likely persists the Address and sets on it some non-null ID.
  4. The PersonsRepository.save probably also calls HashSet.add to ensure that the address is in the set. But since the ID changed, the hash code is now calculated as some value B.
  5. The hash codes A and B map to different buckets in the HashSet, and so the Address.equals method does not even get called during HashSet.add. As a result, you end up with the same instance in two different buckets.
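
To make this concrete, here is a minimal, Spring-free sketch; the MutableAddress class and the literal id value are made up purely for illustration of how mutating a field that feeds hashCode, while the object sits in a set, leaves the same instance stored twice:

import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

public class DuplicateInSetDemo {

    // Stand-in for an entity whose hashCode depends on a mutable, DB-assigned id.
    static class MutableAddress {
        Long id;

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableAddress && Objects.equals(id, ((MutableAddress) o).id);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(id); // changes once the id is assigned
        }
    }

    public static void main(String[] args) {
        Set<MutableAddress> addresses = new LinkedHashSet<>();

        MutableAddress address = new MutableAddress();
        addresses.add(address);               // hashed while id == null (hash code A)

        address.id = 206L;                    // the "save" assigns an id, changing the hash code (B)
        addresses.add(address);               // lands in a different bucket, equals() is never consulted

        System.out.println(addresses.size()); // prints 2: the same instance, twice
    }
}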

Finally, I think your entities should have equals/hashCode semantics based on the ID only. To achieve this with Lombok, you'd use @EqualsAndHashCode as follows:

@Data
@EqualsAndHashCode(of = "id")
public class Person {
    @Id
    private Long id;
    ...
}

@Data
@EqualsAndHashCode(of = "id")
public class Address {
    @Id
    private Long id;
    ...
}

Still, this will not solve the problem you have because it's the ID that changes, so the hash codes will still differ.

One way of handling this would be persisting the Address before adding it to the Set.
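
For illustration only, that idea could look roughly like the snippet below; addressDao stands for a hypothetical CrudRepository<Address, Long> that does not appear in the question, and giving Address its own repository makes it a separate aggregate root, which is a design decision in itself:

// addressDao: hypothetical CrudRepository<Address, Long> (not part of the question).
// Saving the address first fixes its id (and therefore its hash code) before it
// becomes an element of the set, so saving the aggregate afterwards no longer
// mutates it while it sits in the set.
myAddress = addressDao.save(myAddress);
person.getAddress().add(myAddress);
person = personDao.save(person);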

Tomasz Linkowski
  • Thanks for your insights. As I'm just doing experiments, I won't opt for a "production solution" :-) For the moment, I'll try to override the #save output with a call to #findById to see how it behaves. I'll add information to the thread! – Thomas Escolan Nov 30 '18 at 08:33
  • Actually, I chose the opposite resolution to yours: @EqualsAndHashCode(exclude = "id") did the trick; but I'm not sure this is a bright idea in the long term – Thomas Escolan Nov 30 '18 at 08:45
  • @ThomasEscolan Of course `@EqualsAndHashCode(exclude = "id")` will do the trick. But this has profound consequences, as your class is no longer an entity - it becomes a value object. You can read more about that e.g. [here](https://enterprisecraftsmanship.com/2016/01/11/entity-vs-value-object-the-ultimate-list-of-differences/). – Tomasz Linkowski Nov 30 '18 at 09:55
  • Yes, Tomasz, that's absolutely true. Besides, this was a simple case where only the ID is database-generated; it would get worse with triggers and other default values! – Thomas Escolan Nov 30 '18 at 11:44
  • The thing is, Spring Data JDBC's save operation returns the same instance (memory pointer) that was passed in, unlike Spring Data JPA. So you'd have to pre-save every entity (your suggestion) or systematically reload the root entity after saving. – Thomas Escolan Nov 30 '18 at 11:46
  • @ThomasEscolan I wonder if it would work without pre-saving if you replaced `Set` with `List`. After all, you won't really have that many addresses. But I understand that `Set` looks better "semantically". – Tomasz Linkowski Nov 30 '18 at 11:51

Tomasz Linkowski's explanation is pretty much spot on. But I'd argue for a different resolution of the problem.

What happens internally is the following: the Person entity gets saved. This might or might not create a new Person instance, depending on whether Person is immutable.

Then the Address gets saved and thereby gets a new id, which changes its hash code. Then the Address gets added to the Person, since it again might be a new Address instance.

But it is the same instance, now with a changed hash code, which results in the set containing the same Address twice.

What you need to do to fix this is:

Define equals and hashCode so that both are stable when saving the instance

i.e. the hashCode must not change when the instance gets saved, nor through anything else done in your application.

There are multiple possible approaches.

  1. Base equals and hashCode on a subset of the fields, excluding the ID. Make sure that you don't edit these fields after adding the Address to the Set; you essentially have to treat it like an immutable class even if it isn't. From a DDD perspective this treats the entity as a value class.
  2. Base equals and hashCode on the ID and set the ID in the constructor. From a domain perspective this treats the class as a proper entity which is identified by its ID. (A rough sketch of both approaches follows below.)
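
In Lombok terms (matching the question's setup), a hedged sketch of what these two approaches might look like follows; the class names and the street field are made up for illustration, and note that assigning the ID up front also affects how Spring Data JDBC decides between insert and update, which is not covered here:

import lombok.Data;
import lombok.EqualsAndHashCode;
import org.springframework.data.annotation.Id;

// Approach 1, value semantics: equals/hashCode over everything except the id;
// the non-id fields must not be edited while the object sits in a Set.
@Data
@EqualsAndHashCode(exclude = "id")
class ValueStyleAddress {
    @Id
    private Long id;
    private String street;
}

// Approach 2, entity semantics: equals/hashCode over the id only, with the id
// assigned once in the constructor and never touched again afterwards.
@Data
@EqualsAndHashCode(of = "id")
class EntityStyleAddress {
    @Id
    private final Long id;
    private String street;

    EntityStyleAddress(Long id, String street) {
        this.id = id;
        this.street = street;
    }
}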
Jens Schauder
  • Ok, thanks. Do you know whether we have to use the org.springframework.data.annotation.Immutable annotation in order for Spring Data to provide new instances? How are immutable value objects distinguished from mutable entities, then? – Thomas Escolan Nov 30 '18 at 16:47
  • Just make it immutable, i.e. no setters, but instead "withers" or a constructor taking all arguments. But in this context the important part is that the hash code doesn't change. So if you make it immutable but the hash code changes, you will again have two entries in the set. – Jens Schauder Nov 30 '18 at 17:19
  • Hi Jens, when I tried to make my entities immutable (using Lombok @Value instead of @Data), my various tests (save, i.e. insert or update; find all; find by ID) failed with all kinds of exceptions (unsupported operation, constraint violation). – Thomas Escolan Dec 03 '18 at 16:28
  • NB: things got (a bit) better with @Immutable. Still investigating :-) – Thomas Escolan Dec 03 '18 at 20:07
  • I created an issue to investigate whether we can improve the behavior: https://jira.spring.io/browse/DATAJDBC-300 – Jens Schauder Dec 04 '18 at 04:57
  • I appreciate it. I'm wondering about scenarios where entities would be value objects, to demonstrate to my stakeholders, but I'm not at all sure that will be valued. If you know relevant sources (I've seen Evans and Fowler already), please share :-) – Thomas Escolan Dec 04 '18 at 08:49