5

Bloch's wonderful book "Effective Java" points out that if equals is not symmetric then the behavior of Collections contains is indeterminate.

In the example he gives (reproduced with small modifications below), Bloch says that he sees a "false", but could just as well have seen a true or an Exception.

You could see a "true" if the standard does not specify whether contains(Object o) checks e.equals(o) or o.equals(e) for each item in the collection, and the former is implemented. However, the Collections Javadoc clearly states that it has to be the latter (and it's what I observed).

So the only possibilities I see are "false" or possibly an exception (but the String Javadoc seems to preclude the latter).

I understand the broader point, it's likely that an asymmetric equals is will lead to problems in code outside of Collections, but I don't see it for the example he quotes.

Am I missing something?

import java.util.List;
import java.util.ArrayList;

class CIString {
  private final String s;

  public CIString(String s) {
    this.s = s;
  }

  @Override public boolean equals( Object o ) {
    System.out.println("Calling CIString.equals from " + this.s );
    if ( o instanceof CIString) 
      return s.equalsIgnoreCase( ( (CIString) o).s);
    if ( o instanceof String) 
      return s.equalsIgnoreCase( (String) o );
    return false;
  }
  // Always override hashCode when you override equals
  // This is an awful hash function (everything collides -> performance is terrible!)
  // but it is semantically sound.  See Item 10 from Effective Java for more details.
  @Override public int hashCode() { return 42; }
}

public class CIS {
  public static void main(String[] args) {
   CIString a = new CIString("Polish");
   String s = "polish";

   List<CIString> list = new ArrayList<CIString>();
   list.add(a);
   System.out.println("list contains s:" + list.contains(s));
 }
}
Joachim Sauer
  • 302,674
  • 57
  • 556
  • 614
adnan
  • 196
  • 7
  • Minor nitpicking: your example doesn't implement or hint at implementing a hashcode() method, which means that CIString doesn't fulfill the contract, and some equal objects will have unequal hashcodes. I know that's not your question, but for the sake of later readers who google up your question, can you at least put in a note about that? – CPerkins Sep 22 '11 at 15:00
  • I've added a hash function and a comment about the need for it. – adnan Sep 22 '11 at 17:20
  • See my second edit for clarification please – TofuBeer Sep 22 '11 at 17:51
  • by the way [Collections Javadoc](http://download.oracle.com/javase/1.4.2/docs/api/java/util/Collection.html#contains%28java.lang.Object%29) you refer to is for JDK 1.4 - that looks err strange with code below using generics. Anyway javadoc text remains the same ion JDK 1.5 / 1.6 so this is a minor point – gnat Sep 23 '11 at 06:08

2 Answers2

3

It is early in the morning, so perhaps I am missing the true point of your question, this code will fail:

public class CIS 
{
    public static void main(String[] args) 
    {
        CIString a = new CIString("Polish");
        String s = "polish";

        List<String> list = new ArrayList<String>();
        list.add(s);
        System.out.println("list contains a:" + list.contains(a));
    }
}

At the very least it is odd that your code finds it and my code does not (from the point of view of sanity, not that that is clearly how your code is written :-)

Edit:

public class CIS {
  public static void main(String[] args) {
   CIString a = new CIString("Polish");
   String s = "polish";

   List<CIString> list = new ArrayList<CIString>();
   list.add(a);
   System.out.println("list contains s:" + list.contains(s));

   List<String> list2 = new ArrayList<String>();
   list2.add(s);
   System.out.println("list contains a:" + list2.contains(a));
 }
}

Now the code prints out:

list contains s:false
Calling CIString.equals from Polish
list contains a:true

Which still doesn't make sense... and is very fragile. If two objects are equal like a.equals(b) then they must also be equal like b.equal(a), that is not the case with your code.

From the javadoc:

It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.

So, yes, the example in the book may be contrary to the Javadoc of the collections API, but the principle is correct. One should not create an equals method that behaves oddly or eventually problems will arise.

Edit 2:

The key point of the text is:

In Sun’s current implementation, it happens to return false, but that’s just an implementation artifact. In another implementation, it could just as easily return true or throw a run-time exception. Once you’ve violated the equals contract, you simply don’t know how other objects will behave when confronted with your object.

However, given that the Javadoc says what it says it would seem that the behaviour is fixed not an implementation artifact.

If it wasn't in the javadoc, or if the javadoc is not meant to be part of the specification, then it could change at a later date and the code would no longer work.

TofuBeer
  • 60,850
  • 18
  • 118
  • 163
  • when you run your sample, does his "`Calling CIString.equals ...`" message appear in the output? – CPerkins Sep 22 '11 at 15:27
  • Just to clarify, I get the same output at Bloch, `list contains s:false` - the println message would have been easier to understand if written as `"list.contains(a) returns " + list.contains(a)` – adnan Sep 22 '11 at 15:54
  • 1
    Your point is well-taken, however the nub of my question is whether Bloch's claim is holds, namely that you could potentially see `true` (or an exception) as possibilities (and it still seems to me the answer is no). – adnan Sep 22 '11 at 17:25
  • yeah words _In Sun’s current implementation_ caught my attention, too. I wasted few hours studying JDK 1.6, 1.5 and even 1.4 javadocs and it still sounds senseless to me. The only way to make sense of it I could imagine is to treat word _Sun_ as a typo and read instead as _in current implementation_ - assuming that this refers to current implementation of objects participating in statement `list.contains(s)` – gnat Sep 23 '11 at 07:56
1

In the copy of the book I look at now (2nd Edition), item number is 8 and the whole section about symmetry requirement is presented pretty poorly.

Particular issue you mention seem to be caused by usage code being too close to implementation, obscuring the point author is trying to make. I mean, I look at list.contains(s) and I see ArrayList and String through it and all the reasoning about returning true or throwing exception makes zero sense to me, really.

  • I had to move the "usage code" further away from implementation to get the idea of how it may be:

    void test(List<CIString> list, Object s) {
        if (list != null && list.size() > 0) {
            if (list.get(0).equals(s)) { // unsymmetric equality in CIString
                assert !list.contains(s); // "usage code": list.contain(s)
            }
        }
    }
    

Above looks weird but as long as list is our ArrayList and s is our String, test passes.

Now, what will happen if we use something else instead of String? say, what happens if we pass new CIString("polish") as the second argument?

Look, despite passing through first equals check, assertion fails at the next line - because contains will return true for this object.


Similar reasoning applies for the part where Bloch mentions exception. This time, I kept the second parameter as String, but for first one, imagined an List implementation other than ArrayList (that's legal isn't it).

  • You see, List implementations are generally allowed to throw ClassCastException from contains, we just need to get one that does exactly that and use it for our test. One that comes to mind could be based on TreeSet wrapped around our original list with appropriate comparator.

    List<CIString> wrapperWithCce(List<CIString> original,
            Comparator<CIString> comparator) {
        final TreeSet<CIString> treeSet = new TreeSet<CIString>(comparator);
        treeSet.addAll(original);
        return new ArrayList<CIString>() {
            { addAll(treeSet); }
            @Override
            public boolean contains(Object o) {
                return treeSet.contains(o); // if o is String, will throw CCE
            }
        };
    }
    

What happens if we pass list like above and String "polish" to test? list.get(0).equals(s) will still pass the check but list.contains(s) will throw ClassCastException from TreeSet.contains().

This seem to be like the case Bloch had in mind when he mentioned that list.contains(s) may throw an exception - again, despite passing through first equals check.

gnat
  • 6,213
  • 108
  • 53
  • 73
  • The idea that a `List` implementation could legitimately throw `ClassCastException` seems odd to me. It seems to me that asking a list of `Cat` if it contains `Fido` should simply cause it to respond `false`. – supercat Jun 24 '14 at 18:43
  • @supercat maybe API designers wanted `false` to indicate something like _this object isn't there, but you can add it to collection_ - if this is the case, CCE signals that object not only isn't there, but one can't even add it – gnat Jun 24 '14 at 18:58
  • If a method is supposed to say whether a collection would be able to accept a particular object, then it should be called something like `CanAccept`, which could return an enum-like class which could say "This object can't accept anything", "This object can't accept anything of that type", "This object can't accept that particular instance (but might accept others of that type)", "This object will probably accept that instance" [but e.g. the object supports multi-threaded use, and another thread might add a conflicting item], or "This object will definitely be able to accept that instance." – supercat Jun 24 '14 at 19:08
  • @supercat I gave it a bit more thought and, you know, I can't figure a solid compelling argument in favor of throwing CCE against returning false in `contains`, and in all other Collection methods for that matter. The only thing that comes to mind is API designers wanted collections behavior somehow resembling one of plain arrays ([ArrayStoreException](http://docs.oracle.com/javase/7/docs/api/java/lang/ArrayStoreException.html)) but that's pretty weak – gnat Jun 25 '14 at 15:27
  • Some parts of Java and .NET and the associated frameworks seem to be very well engineered; unfortunately, even though collections are rather fundamental to both, neither seems to have done a good job with the associated interfaces. A fundamental problem, I think, stems from a desire to have the type system indicate what can be done with, or assumed about, a collection, rather than providing means by which instances can describe their characteristics and abilities *in detail*. There are many kinds of things which can be done with almost any collection, but which could be done... – supercat Jun 25 '14 at 16:09
  • ...orders of magnitude faster by code which knows something about a collection than by code which doesn't. For example, code which receives an `Iterator` or `IEnumerable` and wishes to have something which will forevermore contain the same sequence of strings as that particular one does now could create a new immutable list with items copied from the original, but if the original already *is* immutable, it would be much more effective to simply copy the original reference. Unfortunately, there's no way in either .NET or Java by which a sequence can say it's immutable. – supercat Jun 25 '14 at 16:16