HashSet is called HashSet because it uses a HashMap to do its work. A HashMap is a very handy structure that lets you find the information related to some key very quickly, as long as that key has a decent hash function defined for it (in Java, a hashCode() that is consistent with equals()).
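Trivially, if a set were implemented using a linked list, it would be called LinkedListSet and not HashSet, and it would be much, much slower: a membership check would have to walk the list element by element instead of jumping straight to the right bucket. Ditto for arrays.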
A PRESENT singleton is used simply because HashMap needs to store something as the value. For HashSet's purposes it does not matter what that value is, as long as something is either there or not, so it might as well always be the same object.
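To make that concrete, here is a minimal sketch of a set layered on a HashMap with a shared dummy value. The class name SimpleHashSet is made up for illustration, and this is a simplified version of the idea, not the actual java.util.HashSet source.

```java
import java.util.HashMap;

// Hypothetical, simplified set built on HashMap (not the real java.util.HashSet source).
final class SimpleHashSet<E> {
    // One shared dummy value: the map needs *some* value, but only key presence matters.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // Returns true if the element was not already in the set.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    // Membership is just "does the map have this key?".
    boolean contains(Object o) {
        return map.containsKey(o);
    }

    // Returns true if the element was in the set before removal.
    boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        SimpleHashSet<String> set = new SimpleHashSet<>();
        System.out.println(set.add("a"));      // true
        System.out.println(set.add("a"));      // false
        System.out.println(set.contains("a")); // true
        System.out.println(set.contains("b")); // false
    }
}
```

Because every entry points at the same PRESENT object, the values cost no extra objects per element; the set-specific code is just a thin wrapper over the map's key handling.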
Before Set came to JavaScript and Perl, you would see this pattern very often: one would simply take an object (JS) or a hash (Perl) and stuff a true or 1 in it for every member that is present. So even without a dedicated HashMap object, the optimal solution was basically the same idea.
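For comparison, this is roughly what that ad hoc idiom looks like when written in Java, with a plain map standing in for the JS object or Perl hash. The class and method names are hypothetical and only there for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo of the "map as a makeshift set" idiom.
final class SeenMapDemo {
    // Returns the distinct items in first-seen order.
    static List<String> distinct(List<String> items) {
        Map<String, Boolean> seen = new HashMap<>();
        List<String> result = new ArrayList<>();
        for (String item : items) {
            if (!seen.containsKey(item)) {
                seen.put(item, true); // the value is irrelevant; only the key matters
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(distinct(List.of("a", "b", "a", "c", "b"))); // [a, b, c]
    }
}
```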
It would be somewhat more memory-efficient to implement the same functionality on a bit-vector, since the only values allowed are absent or present, but it would involve more work and would duplicate existing functionality. The part that works out which index of the array belongs to which key would be the same, though.