-2

Coming from a c++ world, I find reading of the HashSet documentation somewhat hard:

In c++, you would have:

which in turns points to:

Which makes it obvious the requirement for the type of element handled by a std::set. My question is: What are the requirements for the type (E) of elements maintained by a Set in Java ?

Here is a short example which I fail to understand:

import gdcm.Tag;
import java.util.Set;
import java.util.HashSet;

public class TestTag
{
  public static void main(String[] args) throws Exception
    {
      Tag t1 = new Tag(0x8,0x8);
      Tag t2 = new Tag(0x8,0x8);
      if( t1 == t2 )
        throw new Exception("Instances are identical" );
      if( !t1.equals(t2) )
        throw new Exception("Instances are different" );
      if( t1.hashCode() != t2.hashCode() )
        throw new Exception("hashCodes are different" );
      Set<Tag> s = new HashSet<Tag>();
      s.add(t1);
      s.add(t2);
      if( s.size() != 1 )
        throw new Exception("Invalid size: " + s.size() );
    }
}

The above simple code fails with:

Exception in thread "main" java.lang.Exception: Invalid size: 2 at TestTag.main(TestTag.java:42)

From my reading of the documentation only the equals operator needs to be implemented for Set:

What am I missing from the documentation ?

Jens
  • 67,715
  • 15
  • 98
  • 113
malat
  • 12,152
  • 13
  • 89
  • 158
  • Every object has `hashCode` implemented, as the default impl is present in `Object`. Whether that can/should be optimized is another question. – Adam Kotwasinski Oct 02 '17 at 06:41
  • Can you show the implementation of the Tag class? – Eran Oct 02 '17 at 06:42
  • @Eran the whole point of my question is to answer it without the actual Tag implementation (language concept tag). – malat Oct 02 '17 at 06:43
  • 2
    @malat Your exception makes no sense. If `t1.equals(t2)` and `t1.hashCode() == t2.hashCode()`, `s.size()` should be 1, since `HashSet` would consider `t1` and `t2` to be identical. Perhaps the code that gave you the exception you mentioned is not the code you posted. – Eran Oct 02 '17 at 06:46
  • 1
    Only problem is, without the `Tag` implementation, you do not have a [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve). For this reason, it could be argued that this question is off-topic, as I cannot reproduce your issue. – Joe C Oct 02 '17 at 06:48
  • 1
    The C++ `set` *"contains a sorted set of unique objects of type Key. Sorting is done using the key comparison function **`Compare`**"*. The Java equivalent of that is [`TreeSet`](https://docs.oracle.com/javase/8/docs/api/java/util/TreeSet.html) (not `HashSet`): *The elements are ordered using their natural ordering, or by a **`Comparator`** provided at set creation time, depending on which constructor is used.* – Andreas Oct 02 '17 at 06:48
  • Is it transitive? ie. You've shown `t1.equals(t2)` but does `t2.equals(t1)`? If the `hashCode` is equal, and `.equals` returns true, that is all you need to know. The Set will only keep one element. – matt Oct 02 '17 at 06:53
  • 1
    As mentioned by @Eran it's not possible to get the exception you claim you got with the code you posted. Unless you broke `equals` or `hashCode` on purpose Please post your actual code. – Oleg Oct 02 '17 at 07:01

3 Answers3

3

I just tried to reproduce your issue, and maybe you just didn't override equals and/or hashSet correctly.

Take a look at my incorrect implemenation of Tag:

public class Tag {

private int x, y;

public Tag(int x, int y) {
    this.x = x;
    this.y = y;
}

public boolean equals(Tag tag) {
    if (x != tag.x) return false;
    return y == tag.y;
}

@Override
public int hashCode() {
    int result = x;
    result = 31 * result + y;
    return result;
}
}

Looks quite ok doesn't it? But the problem is, I actually do not override the correct equals method, I overloaded it with my own implementation.

To work correctly, equals has to look like this:

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;

    Tag tag = (Tag) o;

    if (x != tag.x) return false;
    return y == tag.y;
}
ChristophE
  • 760
  • 1
  • 9
  • 21
  • Nice way to break it. I just used a boolean that returns true first time and then false. – Oleg Oct 02 '17 at 07:34
  • Indeed I used the wrong signature for `equals` function, kindda hard to track in my case since I used swig to generate the java code. – malat Oct 02 '17 at 07:38
  • As a general tip when programming in Java: Always use the @Override annotation. This also helps if you sometime in the future change the signature of a super class and then suddenly the subclasses behave strange. – ChristophE Oct 02 '17 at 07:45
  • As said above the java code was generated by an engine: swig. But I'll keep this trick burried in mind now :) – malat Oct 02 '17 at 07:49
1

What am I missing from the documentation ?

You are looking at the wrong part of the documentation.

The C++ set is an "sorted set of unique objects", and are "usually implemented as red-black trees."

In Java, Set is a more abstract concept (it's an interface, not a class) with multiple implementations, most notably the HashSet and the TreeSet (ignoring concurrent implementations).

As you can probably guess from the name alone, the Java TreeSet is the equivalent of the C++ set.

As for requirements, HashSet uses the hashCode() and equals() methods. They are defined on the Object class, and needs to be overridden on classes that needs to be in a HashSet or as keys in a HashMap.

For TreeSet and keys of TreeMap, you have two options: Provide a Comparator when creating the TreeSet (similar to C++), or have the objects implement the Comparable interface.

Andreas
  • 154,647
  • 11
  • 152
  • 247
  • Thanks for actually reading my question carefully, indeed now I see the correct signature for `equals()` method. – malat Oct 02 '17 at 07:44
  • I now understand that in Java `Set` is in fact `Set` which may be obvious to all java devs (which explains the downvote), but that is counterintuitive to most c++ programmer. – malat Oct 02 '17 at 07:46
  • 1
    @malat The downvote (at least mine) is because you didn't post the relevant code which would've made answering you simple. "Set is in fact Set" that's far from being correct, java generics are much more complicated then that and they are very different then c++ templates. – Oleg Oct 02 '17 at 08:04
0

I guess this was simply a combination of bad luck and misunderstanding of HashSet requirement. Thanks to @christophe for help, I realized the issue when I tried adding in my swig generated Tag.java class:

@Override
public boolean equals(Object o) {
}

I got the following error message:

gdcm/Tag.java:78: error: method does not override or implement a method from a supertype
  @Override
  ^
1 error
1 warning

Which meant my error was simply:

  • I had the wrong signature in the first place: boolean equals(Object o) != boolean equals(Tag t)

The hint was simply to use the @Override keyword.


For those asking for the upstream code, the Java code is generated by swig. The original c++ code is here:

malat
  • 12,152
  • 13
  • 89
  • 158