On its own, the equals
method is perfectly sufficient to implement a set correctly, but not to implement it efficiently.
The point of a hash code or a comparator is that they provide ways to arrange objects in some ordered structure (a hash table or a tree) which allows for fast finding of objects. If you have only the equals
method for comparing pairs of objects, you can't arrange the objects in any meaningful or clever order; you have only a loose jumble of objects.
For example, with only the equals
method, ensuring that objects in a set are unique requires comparing each added object to every other object in the jumble. Adding n objects requires
n * (n - 1) / 2
comparisons. For 5 objects that's 10 comparisons, which is fine, but for 1,000 objects that's 499,500 comparisons. It scales terribly.
Because it would not give scalable performance, no such set implementation is in the standard library.
If you don't care about hash table performance, this is a minimal implementation of the hashCode
method which works for any class:
@Override
public int hashCode() {
return 0; // or any other constant
}
Although it is required that equal objects have equal hash codes, it is never required for correctness that inequal objects have inequal hash codes, so returning a constant is legal. If you put these objects in a HashSet
or use them as HashMap
keys, they will end up in a jumble in a single hash table bucket. Performance will be bad, but it will work correctly.
Also, for what it's worth, a minimal working Set
implementation which only ever uses the equals
method would be:
public class ArraySet<E> extends AbstractSet<E> {
private final ArrayList<E> list = new ArrayList<>();
@Override
public boolean add(E e) {
if (!list.contains(e)) {
list.add(e);
return true;
}
return false;
}
@Override
public Iterator<E> iterator() {
return list.iterator();
}
@Override
public int size() {
return list.size();
}
}
The set stores objects in an ArrayList
, and uses list.contains
to call equals
on objects. Inherited methods from AbstractSet
and AbstractCollection
provide the bulk of the functionality of the Set
interface; for example its remove
method gets implemented via the list iterator's remove
method. Each operation to add or remove an object or test an object's membership does a comparison against every object in the set, so it scales terribly, but works correctly.
Is this useful? Maybe, in certain special cases. For sets that are known to be very tiny, the performance might be fine, and if you have millions of these sets, this could save memory compared to a HashSet
.
In general, though, it is better to write meaningful hash code methods and comparators, so you can have sets and maps that scale efficiently.