25

I was just wondering if it was a good idea to override equals and hashCode for mutable collections. This would imply that if I insert such a collection into a HashSet and then modify the collection, the HashSet would no longer be able to find the collection. Does this imply that only immutable collections should override equals and hashCode, or is this a nuisance Java programmers simply live with?

fredoverflow
  • 5
    the presence of *equals* and *hashCode* at the top of the OO hierarchy is a nuisance Java programmers have to live with. *equals* and *hashCode* are inherently broken but not only for the reason you mention. I mostly work in Java doing *"OO over immutable objects"* and even when doing that, *equals and hashCode* are broken. In most cases it makes no sense to have these methods: it is impossible to satisfy the *equals* and *hashCode* contract for any non-final class. This is nicely explained in *Effective Java*. – SyntaxT3rr0r Mar 09 '11 at 14:23
  • 1
    See my related question here, 10 upvotes + several favorites: http://stackoverflow.com/questions/2205565/ *equals* and *hashCode* are really broken for anything but the simplest case, where you only put one type of *immutable* object (no inheritance, no nothing) in collections. Your case isn't a simple case, so you'll have lots of issues. *equals* and *hashCode* at the top of the OO hierarchy are a mistake, plain and simple (but most Java programmers don't realize it). – SyntaxT3rr0r Mar 09 '11 at 14:25
  • 2
    @SyntaxT3rr0r - there are ways to handle this well in an inheritance chain, see http://www.angelikalanger.com/Articles/JavaSolutions/SecretsOfEquals/Equals.html . Additionally, entity objects are often mutable, but have some immutable "id" which is used for these methods. So no, they are not "fundamentally broken across all java". like all tools, you need to know how to use them. – jtahlborn Mar 09 '11 at 16:36
  • The StringBuffer and StringBuilder classes also do not override equals() and hashCode() because of mutability. – Akhilesh Dhar Dubey May 15 '14 at 12:27
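
To illustrate the comment above about entity objects that are mutable but carry an immutable "id", here is a minimal sketch. The `Customer` class, its fields, and `EntityIdDemo` are hypothetical names invented for this example. equals and hashCode look only at the final id, so changing other fields should not disturb a HashSet.

```java
import java.util.HashSet;
import java.util.Set;

class EntityIdDemo {
    // Returns true if the set still finds the entity after it was mutated.
    static boolean foundAfterMutation() {
        Set<Customer> set = new HashSet<>();
        Customer c = new Customer(42L, "Alice");
        set.add(c);
        c.setName("Bob"); // the hash is unchanged because the id is unchanged
        return set.contains(c);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());
    }
}

// Hypothetical entity: mutable name, immutable id.
final class Customer {
    private final long id; // never changes, so it is safe to hash on
    private String name;   // mutable, deliberately ignored by equals/hashCode

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }

    void setName(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Customer)) return false;
        return id == ((Customer) o).id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}
```

The class is declared final here to sidestep the subclassing concerns raised in the other comments; that is a choice made for this sketch, not something the comment prescribes.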

8 Answers

5

You should override equals and hashCode if your class should act like a value type. This is usually not the case for collections.

(I don't really have much Java experience. This answer is based on C#.)

mgronber
5

The problem of deep versus shallow equality is bigger than Java; all object-oriented languages have to concern themselves with it.

The objects that you add to the collection should override equals and hashCode, but the default behavior built into the abstract implementation of the collection interface suffices for the collection itself.

duffymo
  • 4
    @duffymo: +1... But not all OO languages made the same mistake that the Java creators did: putting *equals* and *hashCode* at the top of the OO hierarchy as if it made any sense (it doesn't). There should have been an *Equalable* (just made that up) interface or somethin'. And, no, I'm not a Java hater but a Java fanboy :) – SyntaxT3rr0r Mar 09 '11 at 14:28
  • I don't agree that it's a mistake. You need identity for all objects; that's why it's in java.lang.Object. "Equalable"? I think that's a mistake, but that's just me. 8) – duffymo Mar 09 '11 at 14:51
  • 1
    @duffymo: but identity has *nothing* to do with the concept of *equals* and *hashCode*. It simply is incompatible with OO and that's it. There's a great article on the Java equals/hashCode SNAFU (a conversation between Bill Venners and the creator of Scala IIRC) and, once again, it's explained in *"Effective Java"*. It **is** broken for any non-final class and that is a fact. The concept **is** broken for mutable objects and that is also a fact. At one point you have to wonder if their presence at the top of the OO hierarchy still makes any sense or not. – SyntaxT3rr0r Mar 09 '11 at 15:10
  • @duffymo: what do you mean? You thought *equals* and *hashCode* had to do with identity? How could they? The identity issue exists in a lot of languages. Identity is a very important concept in FP languages for example (like, say, Clojure). But *equals* and *hashCode* are Java idiosyncrasies that make no sense with mutability and make even less sense with non-final classes. And of course there are ways to have the "identity" concept in Java without ever using equals and hashCode... – SyntaxT3rr0r Mar 09 '11 at 16:39
  • That's enough Five Hour Energy for you, SyntaxT3rr0r. Relax. It's possible to write equals and hashCode to coincide with identity; it's recommended best practice for Hibernate and persistent objects. Your point is a good one, but the delivery I'm getting through the browser is a bit strident. – duffymo Mar 09 '11 at 16:57
  • Read the Venners article. Pretty good, but anyone who's read "Effective Java" would know about items 1-3. The mutable point is a good one; Python enforces it in its built-in collections. Only hashable keys are allowed in a dictionary, and mutable built-ins such as list are not hashable. – duffymo Mar 09 '11 at 18:22
  • @Syntax: Where do I find that "great article"? Sounds interesting. – fredoverflow Mar 19 '11 at 10:56
  • @SyntaxT3rr0r: Did you see my answer? What do you think? A fundamental difficulty with `equals`/`hashCode` stems from the fact that it is very common for code to hold objects of mutable types but not change them. If the holder of an object knows that it can't be mutated outside the holder's control, that holder can benefit from a looser definition of equivalence than the holder of an object which could be mutated outside its control, but there's no way for a holder to ask for a looser definition of equivalence. – supercat Apr 17 '14 at 16:00
2

It's the same as with any mutable class. When you insert an instance into a HashSet and then call a mutating method, you will get into trouble. So, my answer is: Yes, if there's a use for it.
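
The trouble described above can be reproduced with a small sketch. `MutatedElementDemo` is a hypothetical name; the collection classes are the standard java.util ones. ArrayList computes its hashCode from its elements, so mutating a list that is already stored in a HashSet makes the hash-based lookup miss it, even though the element is still physically in the set.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MutatedElementDemo {
    // Returns { foundBefore, foundAfter, stillPhysicallyPresent }.
    static boolean[] run() {
        Set<List<String>> set = new HashSet<>();
        List<String> list = new ArrayList<>();
        list.add("a");
        set.add(list);
        boolean foundBefore = set.contains(list);

        list.add("b"); // changes list.hashCode() while the list sits in the set
        boolean foundAfter = set.contains(list);

        // The element is still stored; only the hash-based lookup misses it.
        boolean stillPresent = set.iterator().next() == list;
        return new boolean[] { foundBefore, foundAfter, stillPresent };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("found before mutation: " + r[0]);
        System.out.println("found after mutation:  " + r[1]);
        System.out.println("still stored in set:   " + r[2]);
    }
}
```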

You can of course wrap your collection in an immutable wrapper before adding it to the HashSet.
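
One way to read the wrapper suggestion, sketched under the assumption that Java 10 or later is available: `List.copyOf` takes an unmodifiable snapshot, so later changes to the original list cannot change the stored element's hash. Note that `Collections.unmodifiableList` only creates a read-only view of the same backing list, so it does not protect against this problem. `ImmutableSnapshotDemo` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ImmutableSnapshotDemo {
    // Snapshot copy: the stored element is decoupled from the original list.
    static boolean snapshotFoundAfterChange() {
        List<String> mutable = new ArrayList<>();
        mutable.add("a");
        Set<List<String>> set = new HashSet<>();
        List<String> snapshot = List.copyOf(mutable);
        set.add(snapshot);
        mutable.add("b"); // does not affect the snapshot
        return set.contains(snapshot);
    }

    // Read-only view: hashCode still follows the backing list.
    static boolean viewFoundAfterChange() {
        List<String> mutable = new ArrayList<>();
        mutable.add("a");
        Set<List<String>> set = new HashSet<>();
        List<String> view = Collections.unmodifiableList(mutable);
        set.add(view);
        mutable.add("b"); // changes the view's contents and hashCode
        return set.contains(view);
    }

    public static void main(String[] args) {
        System.out.println("snapshot found: " + snapshotFoundAfterChange());
        System.out.println("view found:     " + viewFoundAfterChange());
    }
}
```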

Axel
2

I think the bigger question is what should happen if someone attempts to add an instance of your FredCollection to a Set twice.

FredCollection c = ...
set.add(c);
set.add(c);

Should the size() of set be 2 or 1 after this?

Will you ever need to test the "equality" of two different instances of FredCollection? I think the answer to this question matters more in determining your equals()/hashCode() behavior than anything else.
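
A minimal sketch of the two design choices follows. `IdentityFred` and `ValueFred` are hypothetical stand-ins for FredCollection, invented for this example. Adding the same instance twice yields a set of size 1 either way; the designs only differ when two distinct instances hold the same contents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FredCollectionDemo {
    static int sameInstanceTwice() {
        Set<IdentityFred> set = new HashSet<>();
        IdentityFred c = new IdentityFred("x");
        set.add(c);
        set.add(c); // same reference, so the second add is a no-op
        return set.size();
    }

    static int distinctInstancesIdentity() {
        Set<IdentityFred> set = new HashSet<>();
        set.add(new IdentityFred("x"));
        set.add(new IdentityFred("x")); // different object, default equals
        return set.size();
    }

    static int distinctInstancesValue() {
        Set<ValueFred> set = new HashSet<>();
        set.add(new ValueFred("x"));
        set.add(new ValueFred("x")); // equal contents, treated as a duplicate
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceTwice());
        System.out.println(distinctInstancesIdentity());
        System.out.println(distinctInstancesValue());
    }
}

// Keeps Object's identity-based equals and hashCode.
class IdentityFred {
    final List<String> items = new ArrayList<>();

    IdentityFred(String... xs) {
        items.addAll(Arrays.asList(xs));
    }
}

// Overrides equals and hashCode to compare contents.
final class ValueFred {
    final List<String> items = new ArrayList<>();

    ValueFred(String... xs) {
        items.addAll(Arrays.asList(xs));
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ValueFred && items.equals(((ValueFred) o).items);
    }

    @Override
    public int hashCode() {
        return items.hashCode();
    }
}
```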

matt b
  • Can we please have a working example so that it is handy for future reference? – Deepak Mar 09 '11 at 16:46
  • size() will be 1, because in general, a Set only contains one of each object. c was not assigned a new value. Even if you called c.add("cow") it would still be the same object c. The default equals() is a reference comparison using ==. – Chloe Aug 19 '13 at 19:50
1

This is not just an issue for collections, but for mutable objects in general (another example: Point2D). And yes, it is a potential problem that Java programmers eventually learn to take into account.

Michael Borgwardt
1

You should not override equals and hashCode so that they reflect the mutable member.

This is more of a personal point of view. I think hashCode and equals are technical terms that should not be used to implement business logic. Imagine you have two objects (not only collections) and ask whether they are equal. There are two different ways to answer:

  • technical: they are equal if they represent the same object, which is different from being the same object (think of proxies, serialization, remote calls...)
  • business logic: they are equal if they look the same (same attributes). The important thing here is that there is no single definition of equality, even for the same class within one application. (Sample question: when are two stones equal?)

But because equals is used by technical infrastructure (HashMap), you should implement it in a technical way, and build the business-logic equality with something else (something like the Comparator interface). For your collection this means: do not override equals and hashCode (in a way that breaks the technical contract:

Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map.

(java doc of Map) ).
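
As a rough sketch of keeping business equality outside equals, the example below leaves a class's equals and hashCode untouched and expresses two different notions of "equal stones" as Comparator instances. The `Stone` class, its fields, and the comparator names are hypothetical, chosen only to mirror the stone question above.

```java
import java.util.Comparator;

class StoneEqualityDemo {
    // Two business-level notions of equality for the same class.
    static final Comparator<Stone> BY_WEIGHT =
            Comparator.comparingInt(s -> s.weightGrams);
    static final Comparator<Stone> BY_COLOR =
            Comparator.comparing(s -> s.color);

    static boolean technicallyEqual() {
        // Default Object.equals: two distinct instances are never equal.
        return new Stone(100, "grey").equals(new Stone(100, "red"));
    }

    static boolean sameWeight() {
        return BY_WEIGHT.compare(new Stone(100, "grey"), new Stone(100, "red")) == 0;
    }

    static boolean sameColor() {
        return BY_COLOR.compare(new Stone(100, "grey"), new Stone(100, "red")) == 0;
    }

    public static void main(String[] args) {
        System.out.println("equals:      " + technicallyEqual());
        System.out.println("same weight: " + sameWeight());
        System.out.println("same color:  " + sameColor());
    }
}

// Does not override equals or hashCode.
class Stone {
    final int weightGrams;
    final String color;

    Stone(int weightGrams, String color) {
        this.weightGrams = weightGrams;
        this.color = color;
    }
}
```

Each caller picks the comparator that matches its own definition of equality, so hash-based collections keep working with the untouched equals.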

Ralph
  • Downvote for misquoting from the Java API, which might seriously confuse people reading this. The actual, full quote is (emphasis mine): "Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, ***provided no information used in equals comparisons on the object is modified.***" So as long as the mutable member is also compared in equals, it can be used in the hashCode. Whether it is good practice to implement equals and hashCode for mutable objects is a different matter. – Medo42 Apr 18 '12 at 18:57
  • Downvote retracted, thanks. I see no problem with using equals for business logic though, as long as you are aware of the implications - sometimes you want to know if e.g. a set of people contains a certain person, even if you don't have the exact object (you might have read both from a database). Maybe the problem is that "equals" suggests a check for equal values (let's call that "business equality") when the actual idea is to check whether the two objects denote the same thing ("business identity"). – Medo42 Apr 19 '12 at 11:53
0

A fundamental difficulty with equals and hashCode is that there are two logical ways one may define an equivalence relation; some consumers of a class will want one definition, while other consumers of that same class will want another.

I would define the two equivalence relations as follows:

  • Two object references X and Y are fully equivalent if overwriting X with a reference to Y would not alter the present or future behavior of any members of X or Y.

  • Two object references X and Y have equivalent state if, in a program which has not persisted the values returned from identity-related hash functions, swapping all references to X with all references to Y would leave program state unchanged.

Note that the second definition is primarily relevant in the common scenario where two things hold references to objects of some mutable type (e.g. arrays), but can be sure that, at least within some particular time-frame of interest, those objects are not going to be exposed to anything that might mutate them. In such a scenario, if the "holder" objects are equivalent in all other regards, their equivalence should depend upon whether the objects they hold meet the second definition of equivalence above.

Note that the second definition does not concern itself with any details of how an object's state might change. Note further that immutable objects could, for either definition of equivalence, report distinct objects with equal content as equal or unequal (if the only way in which X and Y differ is that X.equals(X) reports true while X.equals(Y) reports false, that would be a difference), but it would probably be most useful to have such objects use reference identity for the first equivalence relation and equivalence of other aspects for the second.

Unfortunately, because Java only provides one pair of equivalence-defining methods, a class designer must guess which definition of equivalence will be most relevant to consumers of the class. While there's a substantial argument to be made in favor of always using the first, the second is often more practically useful. The biggest problem with the second is that there's no way a class can know when code using the class will want the first equivalence relation.
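
Java arrays happen to expose both notions side by side, which makes for a convenient sketch. Mapping the two definitions above onto these calls is an interpretation made for this example, and `TwoEquivalencesDemo` is a hypothetical name. An array's inherited equals compares references (close to "fully equivalent"), while `Arrays.equals` compares contents (close to "equivalent state").

```java
import java.util.Arrays;

class TwoEquivalencesDemo {
    // Returns { identityEqual, stateEqualBefore, stateEqualAfterMutation }.
    static boolean[] run() {
        int[] x = { 1, 2, 3 };
        int[] y = { 1, 2, 3 };

        boolean identityEqual = x.equals(y);          // reference comparison
        boolean stateEqualBefore = Arrays.equals(x, y); // element-wise comparison

        y[0] = 99; // state equivalence only holds while nobody mutates
        boolean stateEqualAfter = Arrays.equals(x, y);

        return new boolean[] { identityEqual, stateEqualBefore, stateEqualAfter };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));
    }
}
```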

supercat
-1

equals is used when adding and removing elements in collections such as CopyOnWriteArraySet, and in HashSet when two different objects have the same hashCode. equals needs to be symmetric, i.e. if B.equals(C) returns true then C.equals(B) must also return true. Otherwise add/remove on those sets behaves in a confusing manner. See "Overriding equals for CopyOnWriteArraySet.add and remove" for how an improper equals override affects add/remove operations on collections.
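
A small sketch of a symmetry violation follows. `CaseInsensitiveName` is a hypothetical class written for this illustration, and its equals is deliberately broken. It claims to be equal to a plain String, but String.equals knows nothing about it, so the answer depends on which object the call is made on. Collection lookups may then succeed or fail depending on which side the implementation calls equals on, so the sketch prints the lookup result without assuming it.

```java
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

class AsymmetricEqualsDemo {
    static boolean nameEqualsString() {
        return new CaseInsensitiveName("Fred").equals("fred");
    }

    static boolean stringEqualsName() {
        Object name = new CaseInsensitiveName("Fred");
        return "fred".equals(name);
    }

    public static void main(String[] args) {
        System.out.println("name.equals(string): " + nameEqualsString());
        System.out.println("string.equals(name): " + stringEqualsName());

        // The lookup result depends on the direction of the equals call
        // inside the collection, so no particular output is assumed here.
        Set<Object> set = new CopyOnWriteArraySet<>();
        set.add(new CaseInsensitiveName("Fred"));
        System.out.println("set.contains(\"fred\"): " + set.contains("fred"));
    }
}

final class CaseInsensitiveName {
    private final String s;

    CaseInsensitiveName(String s) {
        this.s = s;
    }

    @Override
    public boolean equals(Object o) {
        if (o instanceof CaseInsensitiveName) {
            return s.equalsIgnoreCase(((CaseInsensitiveName) o).s);
        }
        if (o instanceof String) {
            // Deliberately broken: String.equals will never return the favor.
            return s.equalsIgnoreCase((String) o);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return s.toLowerCase(Locale.ROOT).hashCode();
    }
}
```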

yalkris