I stumbled upon the following problem.
I want a HashSet containing all numbers from 1 to 100,000,000.
I tried the following code:
    var mySet = new HashSet<int>();
    for (var k = 1; k <= 100000000; k++)
        mySet.Add(k);
That code failed: I got a memory overflow at around 49 million elements. It was also pretty slow, and memory usage grew excessively.
Then I tried this:
    var mySet = Enumerable.Range(1, 100000000).ToHashSet();
where ToHashSet() is the following extension method:
    public static HashSet<T> ToHashSet<T>(this IEnumerable<T> source)
    {
        return new HashSet<T>(source);
    }
I got a memory overflow again, but I was able to add more numbers than with the previous code.
The thing that does work is the following:
    var tempList = new List<int>();
    for (var k = 1; k <= 100000000; k++)
        tempList.Add(k);
    var numbers = tempList.ToHashSet();
It takes about 800 ms on my system just to fill tempList, whereas the call to Enumerable.Range() takes only 4 ticks!
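A caveat on that timing comparison: Enumerable.Range() uses deferred execution, so calling it only creates an iterator, and the numbers are generated later when something enumerates it. The sketch below separates the two costs. The class name RangeTiming, the helper Enumerate, and the smaller count are hypothetical choices for illustration, not part of my original code.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class RangeTiming
{
    // Forces the sequence to be produced by walking it and summing the values.
    public static long Enumerate(IEnumerable<int> source)
    {
        long sum = 0;
        foreach (var n in source)
            sum += n;
        return sum;
    }

    public static void Main()
    {
        const int count = 1000000; // smaller than 100,000,000 to keep the demo quick

        // Creating the range only builds an iterator object; no numbers exist yet.
        var sw = Stopwatch.StartNew();
        var range = Enumerable.Range(1, count);
        sw.Stop();
        Console.WriteLine($"Create range:    {sw.ElapsedTicks} ticks");

        // The numbers are produced only while the sequence is enumerated.
        sw.Restart();
        var sum = Enumerate(range);
        sw.Stop();
        Console.WriteLine($"Enumerate range: {sw.ElapsedTicks} ticks (sum = {sum})");
    }
}
```

If the second measurement is far larger than the first, the 4 ticks above most likely reflect only the creation of the iterator, not the generation of the numbers.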
I do need the HashSet, because otherwise looking up values would take too much time (I need lookups to be O(1)), and it would be great if I could build it in the fastest way possible.
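One variant I have not tried yet is giving the set its capacity up front, so that it does not have to grow its internal storage repeatedly while being filled. The following is only a sketch under an assumption: it needs a runtime where the HashSet<T>(int capacity) constructor is available (.NET Framework 4.7.2 or later, or .NET Core 2.0 or later). The class name PreSizedSet and the method Build are hypothetical, and whether this avoids the overflow on my machine is untested.

```csharp
using System;
using System.Collections.Generic;

public static class PreSizedSet
{
    // Builds a set holding the numbers 1..count.
    // Assumption: the HashSet<T>(int capacity) constructor exists on the target
    // runtime; it sizes the internal storage for 'count' elements in advance.
    public static HashSet<int> Build(int count)
    {
        var set = new HashSet<int>(count);
        for (var k = 1; k <= count; k++)
            set.Add(k);
        return set;
    }

    public static void Main()
    {
        // Smaller than 100,000,000 so the demo runs quickly.
        var set = Build(1000000);
        Console.WriteLine(set.Count);
        Console.WriteLine(set.Contains(500000));
    }
}
```

The lookup side stays the same: Contains() on the resulting set is still the O(1) operation I am after.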
Now my question is:
Why do the first two methods cause a memory overflow while the third doesn't?
Does HashSet do something special with memory when it is initialized?
My system has 16 GB of memory, so I was quite surprised when I got the overflow exceptions.
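Installed RAM is not the only limit that matters here: a 32-bit process has a much smaller address space than 16 GB no matter how much physical memory the machine has. I have not confirmed how my process runs, so the sketch below is just a quick diagnostic that could be added to the program. The class name MemoryInfo and the method Describe are hypothetical.

```csharp
using System;

public static class MemoryInfo
{
    // Reports whether the current process and OS are 64-bit, plus the size of the
    // managed heap as currently estimated by the garbage collector.
    public static string Describe()
    {
        var heapMb = GC.GetTotalMemory(false) / (1024 * 1024);
        return $"64-bit process: {Environment.Is64BitProcess}, " +
               $"64-bit OS: {Environment.Is64BitOperatingSystem}, " +
               $"managed heap: {heapMb} MB";
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```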