HashSet limit - how to proceed?

Question

My program creates custom objects, and I want a distinct list of them. So I want to use a set and add the objects one by one. The set would prevent duplicates, and in the end I would have a set of unique objects.

I would usually use a HashSet, because I don't need a sorted set. Only, there are so many different potential objects. More than 2^32. The GetHashCode function returns an int, so this cannot work as a unique key for my objects.

I assume that I therefore cannot use HashSet and must use the slower SortedSet, with my object implementing IComparable / CompareTo. Is this correct? Or is there a way to have a HashSet with long hash codes?

A HashSet cannot contain more than 2^31 items. This will limit you as well. — usr, Jul 06 '14 at 22:36

i3arnon · Accepted Answer · 2014-07-06T21:55:08.007

7

GetHashCode does return an int, but the hash code is not the whole story: if two objects turn out to have the same hash code, HashSet follows up by calling the Equals method (which you should override) to decide whether they are really the same object.

So, no, you don't have to switch. You can keep using the same old lovable HashSet (as long as you don't run out of memory).
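To make this concrete, here is a minimal sketch. The `Item` type and its deliberately weak hash function are assumptions made up for illustration, not code from the question. Two distinct objects share a hash code, yet the set keeps both, because `Equals` has the final say:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration only; not from the original question.
public sealed class Item : IEquatable<Item>
{
    public long Id { get; }
    public Item(long id) => Id = id;

    // Deliberately weak hash: only the low 4 bits of Id, so many distinct Ids collide.
    public override int GetHashCode() => (int)(Id & 0xF);

    // Equality is decided on the full 64-bit Id, not on the hash code.
    public bool Equals(Item other) => other != null && Id == other.Id;
    public override bool Equals(object obj) => Equals(obj as Item);
}

public static class Demo
{
    public static int CountDistinct()
    {
        var set = new HashSet<Item>();
        set.Add(new Item(1));
        set.Add(new Item(17)); // same hash code as Id 1, but not equal: kept
        set.Add(new Item(1));  // same hash code and equal: rejected as a duplicate
        return set.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct()); // prints 2
    }
}
```

Collisions only cost lookup time, not correctness, so a real GetHashCode should still spread values as evenly as it can.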

edited Jul 06 '14 at 21:55

answered Jan 05 '14 at 15:38

i3arnon


A hash code does not need to be unique; that is what hashing is all about. Creating a hash code means creating a small identifier for a large object, and that can obviously never yield a unique code for every possible object state. – PMF Jan 05 '14 at 15:51
@I3arnon: Thank you very much for the simple and clear explanation. – Thorsten Kettner Jan 05 '14 at 16:29
@ThorstenKettner sure, any time. – i3arnon Jan 05 '14 at 16:29

@PMF: You are right. Only, I didn't know that the HashSet uses the Equals method at all, so it seemed to me that it used the hash code as a unique identifier (one object per bucket). Now I know better. Thanks for your comment. – Thorsten Kettner Jan 05 '14 at 16:29
