0

I have tried a little bit here. But I don't understand why TreeSet can filter out the duplicate entries here, but HashSet can't.


public class DataClass implements Comparable<DataClass> {
    private final String data;

    public DataClass(String data) {
        this.data = data;
    }

    @Override
    public boolean equals(Object other) {
        if (other == null || other.getClass() != this.getClass()) return false;
        DataClass dc = (DataClass) other;
        return this.data.length() == dc.data.length();
    }

    @Override
    public int compareTo(DataClass dc) {
        return Integer.compare(this.data.length(), dc.data.length());
    }

    @Override
    public String toString() {
        return data;
    }
}

Here is my main method:

DataClass[] data = new DataClass[4];
data[0] = new DataClass("Hans");
data[1] = new DataClass("Tom");
data[2] = new DataClass("Fred");
data[3] = new DataClass("Sia");

System.out.println();
TreeSet<DataClass> treeSet = new TreeSet<>();
for (DataClass dc : data)
    treeSet.add(dc);
for (DataClass dc : treeSet)
    System.out.print(dc + " ");

System.out.println();
HashSet<DataClass> hashSet = new HashSet<>();
for (DataClass dc : data)
    hashSet.add(dc);
for (DataClass dc : hashSet)
    System.out.print(dc + " ");

The output is:

Tom Hans 
Hans Tom Fred Sia 

Doesn't HashSet pay attention to the equals() method? I haven't found anything about this in the documentation. Did I miss something?

  • Or just use `data.length()`, as an `int` value is already sufficient as its own hash code. – Holger Apr 16 '21 at 11:44
  • However, on the official documentation I don't see how HashSet does the comparisons ([See Documentation](https://docs.oracle.com/javase/8/docs/api/java/util/HashSet.html)). Where can I get this information in the future that equals() and hashCode() must be the same for this class? – JavaBeginner Apr 16 '21 at 15:12
  • look at the .add() function in the documentation, the link is here: https://docs.oracle.com/javase/8/docs/api/java/util/HashSet.html#add-E- – null_awe Apr 16 '21 at 16:05
  • Thank you very much. However, nothing says that HashSet still involves the hashCode() function. – JavaBeginner Apr 16 '21 at 16:32
  • Since you have overridden `equals`, you should have looked at [the contract of equals](https://docs.oracle.com/en/java/javase/16/docs/api/java.base/java/lang/Object.html#equals(java.lang.Object)) to see how to do it correctly. It says: “*Note that it is generally necessary to override the `hashCode` method whenever this method is overridden, so as to maintain the general contract for the `hashCode` method, which states that equal objects must have equal hash codes.*” – Holger Apr 19 '21 at 08:42

3 Answers

7

You need to override hashCode() as well for the HashSet to reject these duplicates. HashSet only calls equals() on elements whose hash codes collide, and without an override each DataClass instance inherits Object's hashCode(), which is typically distinct per instance. TreeSet, by contrast, never uses hashCode(); it relies solely on compareTo(). See: HashSet contains duplicate entries
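As a minimal sketch of that fix, the class below mirrors the question's DataClass and adds a hashCode() consistent with its length-based equals(). The names `HashCodeFixDemo`, `LengthKeyedData`, and `distinctInHashSet` are made up for this illustration and do not appear in the question.

```java
import java.util.HashSet;
import java.util.Set;

class HashCodeFixDemo {
    // Hypothetical stand-in for the question's DataClass, with hashCode() added.
    static final class LengthKeyedData implements Comparable<LengthKeyedData> {
        private final String data;

        LengthKeyedData(String data) {
            this.data = data;
        }

        @Override
        public boolean equals(Object other) {
            if (other == null || other.getClass() != this.getClass()) return false;
            return this.data.length() == ((LengthKeyedData) other).data.length();
        }

        @Override
        public int hashCode() {
            // Objects that are equal (same length) now return the same hash code.
            return data.length();
        }

        @Override
        public int compareTo(LengthKeyedData other) {
            return Integer.compare(this.data.length(), other.data.length());
        }

        @Override
        public String toString() {
            return data;
        }
    }

    // Adds every value to a HashSet and reports how many elements survive.
    static int distinctInHashSet(String... values) {
        Set<LengthKeyedData> set = new HashSet<>();
        for (String value : values) {
            set.add(new LengthKeyedData(value));
        }
        return set.size();
    }

    public static void main(String[] args) {
        // Lengths are 4, 3, 4, 3, so only two elements remain.
        System.out.println(distinctInHashSet("Hans", "Tom", "Fred", "Sia")); // prints 2
    }
}
```

Returning the length directly is the simplest hash that agrees with this particular equals(); any function of the length alone would also satisfy the contract.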

Holger
null_awe
2

HashSet is a hash-based data structure, so for two objects to be considered the same you need to implement both hashCode and equals. Two objects are treated as the same element only if they return the same hash code and equals returns true for them.

TreeSet uses compareTo, so you are good there.

Read about the hashCode and equals contract.

Vallabh Patade
1

TreeSet uses the compareTo method to find elements, whereas HashSet uses the hashCode and equals methods.

Documentation of HashSet says -

This class implements the Set interface, backed by a hash table (actually a HashMap instance).

And the documentation of HashMap says -

This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets

That means the hashCode method is used to locate the bucket, and the equals method is then used to find the exact match within it.
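The bucket-then-equals order can be observed with a small probe. This is only a sketch: `BucketLookupDemo`, `Probe`, and `addTwo` are hypothetical names, the probe's equals() deliberately answers "equal" for every other probe (which violates the usual contract), and the call counts reflect how OpenJDK's HashMap behaves rather than anything the Javadoc promises.

```java
import java.util.HashSet;
import java.util.Set;

class BucketLookupDemo {
    // Hypothetical probe: the caller picks the hash code, equals() counts its calls.
    static final class Probe {
        static int equalsCalls = 0;
        private final int hash;

        Probe(int hash) {
            this.hash = hash;
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object other) {
            equalsCalls++;
            return other instanceof Probe; // every probe claims to equal every other probe
        }
    }

    // Adds two probes to a fresh HashSet; returns {set size, number of equals() calls}.
    static int[] addTwo(int firstHash, int secondHash) {
        Probe.equalsCalls = 0;
        Set<Probe> set = new HashSet<>();
        set.add(new Probe(firstHash));
        set.add(new Probe(secondHash));
        return new int[] { set.size(), Probe.equalsCalls };
    }

    public static void main(String[] args) {
        int[] differentHashes = addTwo(1, 2);
        int[] sameHash = addTwo(1, 1);
        System.out.println("different hashes: size=" + differentHashes[0]
                + ", equals calls=" + differentHashes[1]);
        System.out.println("same hash: size=" + sameHash[0]
                + ", equals calls=" + sameHash[1]);
    }
}
```

On OpenJDK, the run with different hash codes keeps both probes without ever calling equals(), while the run with the same hash code calls equals() once and keeps a single element. That is the situation in the question: equals() would report a match, but it is never asked.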

In your case, you are using TreeSet and have provided a compareTo method:

    @Override
    public int compareTo(DataClass dc) {
        return Integer.compare(this.data.length(), dc.data.length());
    }

TreeSet compares the objects by their length, but for HashSet all the objects are different because you have overridden equals without also overriding hashCode.
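To illustrate that TreeSet decides on compareTo alone, here is a sketch with a class that overrides neither equals() nor hashCode(). The names `TreeSetCompareDemo`, `ByLength`, and `keptByTreeSet` are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class TreeSetCompareDemo {
    // Hypothetical class: only compareTo() is defined; equals()/hashCode() come from Object.
    static final class ByLength implements Comparable<ByLength> {
        final String data;

        ByLength(String data) {
            this.data = data;
        }

        @Override
        public int compareTo(ByLength other) {
            return Integer.compare(this.data.length(), other.data.length());
        }
    }

    // Adds every value to a TreeSet and returns the surviving strings in iteration order.
    static List<String> keptByTreeSet(String... values) {
        TreeSet<ByLength> set = new TreeSet<>();
        for (String value : values) {
            set.add(new ByLength(value)); // rejected when compareTo() returns 0 for an existing element
        }
        List<String> kept = new ArrayList<>();
        for (ByLength element : set) {
            kept.add(element.data);
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(keptByTreeSet("Hans", "Tom", "Fred", "Sia")); // prints [Tom, Hans]
    }
}
```

The result matches the TreeSet line of the question's output: "Fred" and "Sia" compare as 0 against elements already present, so they are dropped, and the survivors come out ordered by length.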

Sachin Gorade