
The hash of an Int32 is the value of the Int32 (via: "Hashtables (Dictionary etc) with integer keys").

As such, what value is added by using a Dictionary(Of Integer, someObject) (or any hashing collection, for that matter)?

I will, of course, need to use .Contains(integerKey) for either to prevent errors... but I can skip the hashing algorithm altogether, right?

What type would you use to optimize insertion/retrieval?

EDIT: I expect that I may perform on the order of 10^5 lookups and 10^3 insertions, and these operations are certainly not the bottleneck of my process.

Matthew

2 Answers


Unless the numbers form a range 0...x (in which case you could just use a List<T> or even just an array) I would still go for the Dictionary<int, Whatever> approach. It's simple, it works, and it's almost certainly going to perform fast enough for you.
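To make the two cases concrete, here is a minimal sketch. The class name `KeyedStorageSketch`, its method names, and the `string` value type are hypothetical placeholders rather than anything from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper names, for illustration only.
public static class KeyedStorageSketch
{
    // Dense keys 0..n-1: the key is simply the array index, so no hashing is involved.
    public static string[] BuildDense(int n)
    {
        var items = new string[n];
        for (int i = 0; i < n; i++)
        {
            items[i] = "item" + i;
        }
        return items;
    }

    // Sparse or arbitrary keys: let Dictionary<int, T> handle storage and lookup.
    public static Dictionary<int, string> BuildSparse(IEnumerable<int> keys)
    {
        var items = new Dictionary<int, string>();
        foreach (int key in keys)
        {
            // The indexer overwrites an existing key instead of throwing.
            items[key] = "item" + key;
        }
        return items;
    }
}
```

With keys such as 7, 100000 and -5, an array would need to be sized for the largest key (and cannot take negative keys at all), which is where the dictionary earns its place.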

This really sounds like micro-optimization which ought to be skipped until you've proved that you've got a problem. How often are you going to be looking up items in the dictionary, compared with other operations?

EDIT: As Timwi says, there are indeed potential savings to be made here if this is really performance-critical. Without a generic key type and the virtual method calls to fetch hash codes and compare values, you could certainly do better. But I wouldn't trust any third-party collection as much as the built-in ones, and I certainly wouldn't trust my own collection implementations for anything non-trivial without a huge amount of testing... it would have to be a really significant bottleneck in the application as a whole before I considered moving away from the built-in types.
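One rough way to gather that evidence is to time the dictionary at the scale the question mentions (about 10^3 insertions and 10^5 lookups) and compare the result with the running time of the whole process. The sketch below uses `Stopwatch`; the class name `LookupTimingSketch` and the key pattern are assumptions for illustration, and a single timed run like this is only indicative, not a proper benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical timing helper, for illustration only.
public static class LookupTimingSketch
{
    // Inserts `insertions` keys, then performs `lookups` TryGetValue calls.
    // Returns the number of hits so the lookup loop has an observable result.
    public static long Run(int insertions, int lookups, out TimeSpan elapsed)
    {
        var map = new Dictionary<int, string>();
        var stopwatch = Stopwatch.StartNew();

        for (int i = 0; i < insertions; i++)
        {
            map[i * 7] = "value"; // arbitrary non-contiguous keys
        }

        long hits = 0;
        for (int i = 0; i < lookups; i++)
        {
            string value;
            if (map.TryGetValue(i, out value))
            {
                hits++;
            }
        }

        stopwatch.Stop();
        elapsed = stopwatch.Elapsed;
        return hits;
    }
}
```

Calling `LookupTimingSketch.Run(1000, 100000, out elapsed)` and printing `elapsed` next to the total run time of the real workload shows what fraction of the time the dictionary actually accounts for.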

Jon Skeet
  • Thanks for the input. I had a feeling I was micro-optimizing... do you know whether the Dictionary will short-circuit hashing for `Int32` values or will it proceed with the hashing (and return the same value)? – Matthew Apr 04 '11 at 16:46
  • @Matthew: It calls Int32.GetHashCode, which just returns the int - which is very fast. – Reed Copsey Apr 04 '11 at 16:48
  • While you are right that this *may* be micro-optimisation in some cases, if we assume for a moment that it is performance-critical code, the generic Dictionary is incredibly slow. This may well be unavoidable in a generic class. But a custom hash table class I wrote specifically for `int` keys (and another for `long` keys) is several tens of times faster. – Timwi Apr 04 '11 at 16:54
  • @Timwi: I would still use the built-in collection until I'd *proved* a problem. Which exact sentence here do you disagree with? Do you have any idea how many lookups Matthew will perform? I can't see that information anywhere in the question. I still wouldn't try to optimize this until I knew it was significant. Saving 90% of 0.01% of execution time is still going to be insignificant... – Jon Skeet Apr 04 '11 at 16:54
  • @Timwi: You've added **if we assume for a moment that it is performance-critical code** since I added my comment. I don't think we *should* assume that... partly because developers often assume it with no evidence, and I think it's better to only start building one's own collection when you have *evidence* that it's worth doing so. – Jon Skeet Apr 04 '11 at 16:55
  • @Reed: That is not entirely correct. It calls `IEqualityComparer.GetHashCode(int)`, which in the case of the default equality comparer calls `IEquatable.GetHashCode()`, which in turn returns the integer. There are two levels of interface method resolution before it gets to the integer... – Timwi Apr 04 '11 at 16:55
  • The only time I've seen performance gains from a custom hashtable is with Guids as the GetHashCode override on Guid used to take object, incurring boxing. As Jon says, premature optimisation until you can see that the dictionary is causing the issues. I've never seen this as the case, aside from the Guid example I gave. – Adam Houldsworth Apr 04 '11 at 16:56
  • @Adam: It is unclear what you are talking about. `Guid` implements `IEquatable`, which the default equality comparer will use. – Timwi Apr 04 '11 at 17:00
  • @Timwi: Perhaps this is back in the .NET 1.1 days, before `IEquatable`? I've just checked and Guid has implemented `IEquatable` since .NET 2.0. – Jon Skeet Apr 04 '11 at 17:02
  • @Jon: Well, of course we both agree that we know little to nothing about the OP’s specific case. But the last time *I* ran into this, I couldn’t really tell that the dictionary was slow until I went ahead and wrote my own int-keyed hashtable. It was well worth it then, but perhaps only because I already knew that my code was performance-critical. – Timwi Apr 04 '11 at 17:02
  • @Jon I believe it was actually, never mind, I remembered incorrectly. – Adam Houldsworth Apr 04 '11 at 17:02
  • @Timwi: Did you have a good idea of how many lookups it was performing in what space of time? That's the first thing Matthew should find out to see if this has a hope of being performance-critical, IMO. – Jon Skeet Apr 04 '11 at 17:03
  • @Reed: That would be interesting, actually... it would remove one layer of virtual method calls; if the optimization in Timwi's implementation was *just* due to the removal of those method calls, it could still make a significant difference. Will try to test it tonight... – Jon Skeet Apr 04 '11 at 17:04
  • @Timwi: That's actually not true either - It uses `IEqualityComparer.CreateComparer`, which in turn creates a `GenericEqualityComparer`, which does call `Int32.GetHashCode` for its implementation. This, in turn, just returns the int. However, once this comparer has been created, there's only 1 level of indirection here (Dictionary's `this.comparer.GetHashCode(key)`) - You could simplify this by making the comparer just return the int, and save a single method call, but I doubt there would be much in the way of gains here, unless it's very performance critical. – Reed Copsey Apr 04 '11 at 17:07
  • @Jon: Yes - you can eliminate one method call by writing your own comparer. It'd be very interesting to see if the extra method call makes any difference post-JIT, though. – Reed Copsey Apr 04 '11 at 17:09
  • @Reed: Just tried using a custom `IEqualityComparer` - it was actually a bit slower than the default. Not sure why. – Jon Skeet Apr 04 '11 at 17:10
  • Note that the generic-ness of the Dictionary class makes a big difference in the case of value-type keys. If a non-generic keyed collection, such as Hashtable, is substituted for the Dictionary, performance would suffer by an order of magnitude or two, as every lookup would incur boxing penalties that are not incurred by lookups against Dictionary. – JohnC Apr 04 '11 at 17:11
  • @JohnC: Absolutely. I'm assuming Timwi was genuinely comparing against the generic dictionary though - I have no reason to believe otherwise :) – Jon Skeet Apr 04 '11 at 17:12
  • @Reed: Indeed. This is surprising. `IEqualityComparer.CreateComparer` uses `GenericEqualityComparer` whenever `T` implements `IEquatable`, so I blindly assumed that `GenericEqualityComparer.GetHashCode(T)` would use that interface. It is interesting that it doesn’t, although `GenericEqualityComparer.Equals(T,T)` does... – Timwi Apr 04 '11 at 17:14
  • This thread is going on too fast, I think I’ll stop trying to keep up :) – Timwi Apr 04 '11 at 17:15

The Dictionary will just call GetHashCode on the type, so for Int32 I would imagine this to be pretty quick. Basically, I think it's already optimised enough for you.
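As a small illustration of the point, the sketch below checks that `Int32.GetHashCode` returns the integer itself, and shows `TryGetValue` as an alternative to a separate existence check followed by an indexer read. The class name `IntKeySketch` and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper names, for illustration only.
public static class IntKeySketch
{
    // Int32.GetHashCode returns the integer value, so this is true for every int.
    public static bool HashIsValue(int key)
    {
        return key.GetHashCode() == key;
    }

    // TryGetValue performs the existence check and the fetch in a single lookup;
    // this helper returns null when the key is absent.
    public static string Find(Dictionary<int, string> map, int key)
    {
        string value;
        return map.TryGetValue(key, out value) ? value : null;
    }
}
```

Using `TryGetValue` avoids looking the key up twice, which is usually a simpler saving than replacing the collection.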

Which collection type I use depends on the type of the key; I tend not to worry about performance in most uses.

Adam Houldsworth