1

I am working with Spring Data JPA and Entity Graphs. I have the following Entity structure: Result entity has a list of SingleQuestionResponse entities, and the SingleQuestionResponse entity has a set of Answer entities (markedAnswers).

public class Result {
...

 @OneToMany(cascade = CascadeType.PERSIST)
 @JoinColumn(name = "result_id", nullable = false)
 private List<SingleQuestionResponse> responses;

...

}
public class SingleQuestionResponse {
...

@ManyToMany(fetch = FetchType.LAZY)
@JoinTable(
        name = "singlequestionresponses_answers",
        joinColumns = @JoinColumn(name = "single_question_response_id"),
        inverseJoinColumns = @JoinColumn(name = "answer_id")
)
private Set<Answer> markedAnswers;

...
}

and Answer just has simple-type fields.

Now, I would like to be able to fetch Result, along with all responses, and the markedAnswers in one query. For that I annotated the Result class with:

@NamedEntityGraph(name = "graph.Result.responsesWithQuestionsAndAnswersEager",
        attributeNodes = @NamedAttributeNode(value = "responses", subgraph = "responsesWithMarkedAnswersAndQuestion"),
        subgraphs = {
                @NamedSubgraph(name = "responsesWithMarkedAnswersAndQuestion", attributeNodes = {
                        @NamedAttributeNode("markedAnswers"),
                        @NamedAttributeNode("question")
                })
        }
)

an example of usage is:

@EntityGraph("graph.Result.responsesWithQuestionsAndAnswersEager")
List<Result> findResultsByResultSetId(Long resultSetId);

I noticed, that calling the findResultsByResultSetId method (and other methods using this entity graph) results in responses (SingleQuestionResponse entities) being multiplied by the number of markedAnswers. What I mean by that is that result.getResponses() returns more SingleQuestionResponse objects than it should (it returns one response object per each markedAnswer).

I realize this is due to Hibernate making a Cartesian product with the join, but I have no idea how to fix it.

Can you help please? Thanks

furry12
  • 902
  • 1
  • 14
  • 31

1 Answers1

0

You have to use the DISTINCT operator. With Spring Data JPA, this can be done by naming the method findDistinctResultsByResultSetId

Christian Beikov
  • 15,141
  • 2
  • 32
  • 58
  • I think that would distinct just root entity `Result` while OP issue regards the "middle" relationship, being `SingleQuestionResponse` collection stored in a `List`. Is there any way to instruct Hibernate to deduplicate that one without turning collection in a `Set`? – 4javier Feb 03 '22 at 18:13
  • Distinct is by default passed through to SQL so it will usually also deduplicate the elements in the list. – Christian Beikov Feb 04 '22 at 09:52
  • True. But hibernate will pass a `distinct` combining all selected columns, so every row will result different from the other ones. Tested copy/pasting produced query in mysql cli. Another proof of the fact that the "root distinction" happens in-memory, is that it generates exactly the same result if `HINT_PASS_DISTINCT_THROUGH` is passed. At the moment, result set generated by DB will always be cartesian product of all the three joined tables, and Hibernate in memory will deduplicate just the first one. – 4javier Feb 04 '22 at 12:28
  • If multiple collections are join fetched, then yes, using distinct won't help. In that case you will have to use a `Set` everywhere or use `SELECT` or `SUBSELECT` fetching – Christian Beikov Feb 04 '22 at 15:12