
From "Apress - Beginning Hibernate From Novice to Professional" p. 161, where they explain the bag collection:

If the elements lack a proper key, there will be a performance impact that will manifest itself when update or delete operations are performed on the contents of the bag.

  1. What do they mean by a proper key?

  2. Why will there be a performance impact in the case of update or delete operations performed on the bag elements?

rapt

2 Answers


They mean a primary key on the underlying database table. Without a key, the database needs a table scan to find the rows to update or delete, whereas it can use an index seek when a key is present.

Charleh
  • That's true to any table that does not contain a key. Why would they explicitly specify it when discussing bags? – rapt Jul 08 '12 at 23:34

Say you have an entity Parent with a collection of Children. Without using an indexed column on a list, Hibernate will use "Bag Semantics" for handling the collection of children. This means that the collection is unordered and can contain duplicates. If you watch your SQL log when you delete a child, you will see one DELETE statement that deletes all of the children, followed by (number of children - 1) INSERT statements that re-insert all of the undeleted children. Why not just a single DELETE statement?

See this link for a full explanation (http://assarconsulting.blogspot.com/2009/08/why-hibernate-does-delete-all-then-re.html).

Clearly, a single DELETE statement would be more efficient, right? In most cases we actually want a Set, as our entities are typically unique. However, a lot of developers still use List out of habit. By default, for a list without an indexed column, Hibernate will use Bag Semantics, giving the worse performance.
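
To make the difference concrete, here is a minimal sketch in plain Java. It is not Hibernate code and does not reproduce Hibernate's internals; it only simulates the SQL-like statements described above, assuming a hypothetical collection table `parent_children` with columns `parent_id` and `child`. The class and method names (`BagVsSetSketch`, `syncBagAfterRemove`, `syncSetAfterRemove`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation, not Hibernate code: it only builds the SQL-like
// statements needed to sync a collection table after one element is removed.
final class BagVsSetSketch {

    // Bag semantics: rows have no key and may repeat, so a single row cannot
    // be addressed. All rows of the parent are deleted, then the remaining
    // elements are re-inserted one by one.
    static List<String> syncBagAfterRemove(long parentId, List<String> remaining) {
        List<String> sql = new ArrayList<>();
        sql.add("DELETE FROM parent_children WHERE parent_id = " + parentId);
        for (String child : remaining) {
            sql.add("INSERT INTO parent_children (parent_id, child) VALUES ("
                    + parentId + ", '" + child + "')");
        }
        return sql;
    }

    // Set semantics: (parent_id, child) is unique, so the pair acts as a key
    // and the removed row can be targeted directly.
    static List<String> syncSetAfterRemove(long parentId, String removed) {
        List<String> sql = new ArrayList<>();
        sql.add("DELETE FROM parent_children WHERE parent_id = " + parentId
                + " AND child = '" + removed + "'");
        return sql;
    }

    public static void main(String[] args) {
        // A bag with a duplicate element "b".
        List<String> children = new ArrayList<>(Arrays.asList("a", "b", "b", "c"));
        children.remove("b"); // removes only the first "b"; one duplicate remains

        // One DELETE plus one INSERT per remaining child.
        syncBagAfterRemove(1L, children).forEach(System.out::println);

        // With unique elements, a single DELETE is enough.
        syncSetAfterRemove(1L, "c").forEach(System.out::println);
    }
}
```

With four children in the bag, removing one produces four statements in the bag case (one DELETE and three INSERTs), against a single DELETE in the set case.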

jeff
  • :-) You say what apparently happens: bad performance due to deleting all children & then re-inserting them - but you didn't say why doing it this way. I've also read the link - the guy also tells what happens, but not why. He says "We asked earlier, why not just issue one delete statement. If it stopped here we would end up with (parent has no children)". Yes, but why deleting all children in the 1st place? Delete one of the children that `equals` the child that is indexed 0 in the java collection. Also, unlike you he says only equal children are deleted, not all children (his image disagrees) – rapt Jul 09 '12 at 02:03
  • You can't use equals b/c a bag can contain duplicates - equals could result in multiple deletes - so to play it safe Hibernate deletes all children - then reinserts all objects that are still left in your Java collection (in memory) – jeff Jul 09 '12 at 12:15
  • Think about this. A bag can contain duplicates, ie elements that return `true` when compared by `equals`. Say I have duplicates in my bag & I remove one from the Java bag. In commit, we can simply delete one (not all) of the duplicates, no matter which - they're all the same to `equals`. What's wrong with that? Also, if the elements in the bag are entities (ie have unique id's), you don't have to delete the element based on its complete list of properties & FK to parent, just by its id. Also, hope you see that in the link they talk about deleting all duplicates, not all children - less painful – rapt Jul 09 '12 at 13:45
  • From your original post "If the elements lack a proper key".... So how would you delete just one using SQL? Limit by row number? They are unordered so SQL will give you no guarantees on which record gets deleted. As far as deleting all duplicates vs deleting all children - see what happens when you delete a child using Bag Semantics in hibernate. If you are still curious after this, I invite you to continue reading your book/conducting research. – jeff Jul 09 '12 at 14:02
  • Tested it: all children are deleted (i.e. the guy in the link is confused). --- I'd delete just 1 element with "DELETE TOP 1 FROM .. WHERE .." or the dialect equivalent. So it is feasible to remove 1 child with 1 DELETE even with an unindexed bag. I'm saying, if I only remove a child from the bag, I can avoid delete-all-reinsert. Correct me if I'm wrong. --- But I see now why Hibernate is more aggressive: it's b/c things can be more complicated, e.g. I can update, or update + delete. Then the changed Java element is no longer in the db, so I can't locate it by SQL. So I need to delete and recreate the bag in the db. – rapt Jul 09 '12 at 17:01
  • BUT, I see the same problem with a non-id set (component set). You said that a Set would be more efficient. WHY? --- Also, you say "By default, for a list without an indexed column, hibernate will use Bag Semantics". This is not by default - you have no other choice w/o an index. --- Also, you say I'd need to use an indexed bag to avoid delete-all-reinsert. That's true for doing only an update. But if I remove an element from the bag: I can DELETE only that specific element from the db, but I should update/reinsert all elements of higher indexes (decrement). So an indexed bag does not let me remove with one DELETE. Correct me if wrong – rapt Jul 09 '12 at 17:02