4

I'm aware that exception trapping can be expensive, but I'm wondering if there are cases when it's actually less expensive than a lookup?

For example, if I have a large dictionary, I could either test for the existence of a key:

If MyDictionary.ContainsKey(MyKey) Then _
  MyValue = MyDictionary(MyKey) ' This is 2 lookups just to get the value.

Or, I could catch an exception:

Try
  MyValue = MyDictionary(MyKey) ' Only doing 1 lookup now.
Catch e As Exception
  ' Didn't find it.
End Try

Is exception trapping always more expensive than lookups like the above, or is it less so in some circumstances?

John Saunders
ingredient_15939
  • I have edited your title. Please see, "[Should questions include “tags” in their titles?](http://meta.stackexchange.com/questions/19190/)", where the consensus is "no, they should not". – John Saunders Dec 03 '12 at 18:21

3 Answers

5

The thing about dictionary lookups is that they happen in constant or near-constant time. It takes your computer about the same amount of time whether your dictionary holds one item or one million items. I bring this up because you're worried about making two lookups in a large dictionary, and the reality is that it's not much different from making two lookups in a small dictionary. As a side note, one implication is that dictionaries are not always the best choice for small collections, though I normally find the extra clarity still outweighs any performance issues for those small collections.

One of the things that determines just how fast a dictionary can make its lookups is how long it takes to generate a hash value for a particular object. Some objects can do this much faster than others. That means the answer here depends on the kind of object used as the key in your dictionary. Therefore, the only way to know for sure is to build a version that tests each method a few hundred thousand times to find out which completes the set faster.

Another factor to keep in mind here is that it's mainly just the Catch block that is slow with exception handling, so you'll want to test with a combination of lookup hits and misses that reasonably matches what you'd expect in production. For this reason, you can't find a general guideline here, or if you do, it's likely to be wrong. If you only rarely have a miss, then I would expect the exception handler to do much better (and, by virtue of a miss being somewhat, well, exceptional, it would also be the right solution). If you miss more often, I might prefer a different approach.
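As a rough sketch of what such a test could look like, the console program below times both approaches over the same mix of hits and misses. The dictionary contents, the iteration count, and the miss rate (`MissEvery`) are all made-up values for illustration; substitute your own key type and the ratio you expect in production, and run a Release build without a debugger attached.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Diagnostics

Module HitMissBenchmarkSketch
    Sub Main()
        ' All three constants are hypothetical; tune them to your workload.
        Const Iterations As Integer = 200000
        Const KeyCount As Integer = 1000
        Const MissEvery As Integer = 1000 ' one miss per 1,000 lookups

        Dim dict As New Dictionary(Of Integer, Integer)
        For i As Integer = 0 To KeyCount - 1
            dict.Add(i, i)
        Next

        Dim value As Integer
        Dim hits1 As Integer = 0
        Dim hits2 As Integer = 0

        ' Approach 1: ContainsKey, then the indexer (two lookups on a hit).
        Dim sw1 As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To Iterations
            ' Key -1 is never in the dictionary, so it forces a miss.
            Dim key As Integer = If(i Mod MissEvery = 0, -1, i Mod KeyCount)
            If dict.ContainsKey(key) Then
                value = dict(key)
                hits1 += 1
            End If
        Next
        sw1.Stop()

        ' Approach 2: the indexer inside Try/Catch (one lookup, exception on a miss).
        Dim sw2 As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To Iterations
            Dim key As Integer = If(i Mod MissEvery = 0, -1, i Mod KeyCount)
            Try
                value = dict(key)
                hits2 += 1
            Catch ex As KeyNotFoundException
                ' Miss: nothing to do.
            End Try
        Next
        sw2.Stop()

        Console.WriteLine("ContainsKey: " & sw1.ElapsedMilliseconds & " ms (" & hits1 & " hits)")
        Console.WriteLine("Try/Catch:   " & sw2.ElapsedMilliseconds & " ms (" & hits2 & " hits)")
    End Sub
End Module
```

Lowering `MissEvery` makes misses more frequent, which shows how the balance between the two approaches shifts as the miss rate changes.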

And while we're at it, let's not forget about Dictionary.TryGetValue().
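For reference, here is a minimal sketch of the TryGetValue pattern. The dictionary contents and variable names are placeholders for illustration; the point is that the existence check and the read happen in a single lookup, with no exception on a miss.

```vbnet
Imports System
Imports System.Collections.Generic

Module TryGetValueSketch
    Sub Main()
        ' Hypothetical sample data, for illustration only.
        Dim myDictionary As New Dictionary(Of String, Integer) From {
            {"One", 1},
            {"Two", 2}
        }

        Dim myValue As Integer
        ' One lookup: TryGetValue returns True and fills myValue on a hit,
        ' or returns False and sets myValue to its default (0) on a miss.
        If myDictionary.TryGetValue("Two", myValue) Then
            Console.WriteLine("Found: " & myValue)
        Else
            Console.WriteLine("Didn't find it.")
        End If
    End Sub
End Module
```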

Joel Coehoorn
  • +1. I was going to suggest TryGetValue - good thing you mentioned it. But even then, a dictionary is able to perform 9 billion lookups per second (I tested it once), so this is normally not the bottleneck. – Victor Zakharov Dec 03 '12 at 18:21
  • @Neolisk I suppose that depends on the hashing function used by the objects and how many objects and potential collisions in the dictionary. – Magnus Dec 03 '12 at 18:25
  • @Magnus: of course, see my synthetic test below. Not as fast as I mentioned here (I probably tested on integers as keys last time), but still rather fast compared to the `TryCatch` approach. – Victor Zakharov Dec 03 '12 at 18:46
  • @Joel Great answer by the way – Magnus Dec 03 '12 at 18:56
1

I tested the performance of ContainsKey vs. TryCatch. Here are the results:

With debugger attached:

[Image: benchmark results with debugger attached]

Without debugger attached:

[Image: benchmark results without debugger attached]

Tested on a Release build of a console application containing just Sub Main and the code below. ContainsKey is ~37,000 times faster with the debugger attached and still 355 times faster without it, so even if you do two lookups, it is not nearly as bad as catching an exception. This assumes you look up missing keys quite often.

Dim dict As New Dictionary(Of String, Integer)
With dict
  .Add("One", 1)
  .Add("Two", 2)
  .Add("Three", 3)
  .Add("Four", 4)
  .Add("Five", 5)
  .Add("Six", 6)
  .Add("Seven", 7)
  .Add("Eight", 8)
  .Add("Nine", 9)
  .Add("Ten", 10)
End With

Dim stw As New Stopwatch
Dim iterationCount As Long = 0
Do
  stw.Start()
  If Not dict.ContainsKey("non-existing key") Then 'always true
    stw.Stop()
    iterationCount += 1
  End If
  If stw.ElapsedMilliseconds > 5000 Then Exit Do
Loop

Dim stw2 As New Stopwatch
Dim iterationCount2 As Long = 0
Do
  Try
    stw2.Start()
    Dim value As Integer = dict("non-existing key") 'always throws exception
  Catch ex As Exception
    stw2.Stop()
    iterationCount2 += 1
  End Try
  If stw2.ElapsedMilliseconds > 5000 Then Exit Do
Loop

MsgBox("ContainsKey: " & iterationCount / 5 & " per second, TryCatch: " & iterationCount2 / 5 & " per second.")
Victor Zakharov
  • Was that test run with or without a debugger attached? An attached debugger can greatly increase the overhead of exception handling. Also, if the `Dictionary` contains many different values that have the same hash code, lookups may be arbitrarily slow. For example, define a simple two-field structure which doesn't override `Equals` or `GetHashCode()` [I think `Tuple` does override those things, so create a custom struct], and create a `Dictionary` with 10,000 instances of that structure as keys, where the first field holds the same value in all 10,000 instances. – supercat Dec 03 '12 at 19:43
  • Also: it's mainly entering a catch block that is slow with exception handling. The benchmark here does that every time, but production code may do that only rarely. If most of your lookups are able to avoid the exception handler, you might get a wildly different result. – Joel Coehoorn Dec 03 '12 at 20:09
  • @supercat: With debugger - I will include another set of results without a debugger shortly. Overall, are you trying to say that exception handling will ever be faster than doing ContainsKey for real production use? Perhaps, but I have yet to see an example of this. You can elaborate in your answer btw. – Victor Zakharov Dec 03 '12 at 20:29
  • @JoelCoehoorn: I see your point here, but again, cannot imagine a production application where you would rather use `TryCatch` instead of `TryGetValue`. If you can provide a real world (even theoretical) example of such decision, would be interesting to know. Thanks for your feedback. – Victor Zakharov Dec 03 '12 at 20:35
  • @Neolisk: Situations exist where `ContainsKey` would take longer than a thrown exception. I would guess that nearly all such situations are the result of bad `GetHashCode` implementations that should be fixed, but the question was "is try/catch *ever* less expensive". If one wants to try reading a value, one should use `TryGetValue`. The only realistic situation I can think of where using try/catch for flow control on a dictionary would be efficient would be when adding a value, if one expects that in 99.9% of cases the value won't exist, and... – supercat Dec 03 '12 at 20:49
  • ...in those few cases where it does exist one will not want to overwrite the pre-existing value (one could use the indexed property setter for a combined add/replace if add/replace semantics were the goal). `ConcurrentDictionary` has some "try" methods for adding keys, but the "normal" dictionary does not. – supercat Dec 03 '12 at 20:51
  • @supercat: I think you meant in 99.9% of cases the value *exists* (so that the exception does not happen). In other words, if missing values are very rare, then the `Try/Catch` approach will work better. Makes sense, although you would hardly ever save overall processing time by doing this, meaning the bottleneck will probably be elsewhere. I still consider `TryGetValue` to be the general rule in similar implementations, with other cases being exceptions to it. Put simply, `TryGetValue` will be faster in most use cases. **Disclaimer:** This is a personal statement, not a fact. :) – Victor Zakharov Dec 03 '12 at 21:25
  • @Neolisk: Using `TryGetValue` is a no-brainer. The tricky case is when trying to add something to the collection only if it isn't already there. The `Add` method throws an exception in the "item exists" case; the indexed property setter silently replaces an item. If one wants to do something other than bomb or replace an item, one must either pre-check for the item's existence or else attempt the operation. The case where the item exists is the slow one. – supercat Dec 03 '12 at 21:31
  • @supercat: ok, now I see why you mentioned `ConcurrentDictionary` - the regular dictionary is missing an equivalent of [`.GetOrAdd`](http://msdn.microsoft.com/en-us/library/ee378676.aspx) method to make it in one shot. Thanks for clarification. – Victor Zakharov Dec 03 '12 at 21:37
  • @Neolisk: Incidentally, even in cases where the try/catch would be faster, I favor using the pre-check except in cases where the pre-existence of an item in the dictionary should be regarded as a "recoverable problem"; conversely, in those cases where it should be regarded as a "recoverable problem", I would favor try/catch unless it would pose an unacceptably severe performance hit. – supercat Dec 03 '12 at 21:41
0

If you are trying to find an item in a data structure of some kind which is not easily searched (e.g. finding an item containing the word "flabbergasted" in an unindexed string array of 100K items), then yes, letting it throw the exception would be faster because you'd only be doing the look-up once. If you check whether the item exists first and then get the item, you are doing the look-up twice. However, in your example, where you are looking up an item in a dictionary (hash table), the look-up should be very quick, so doing it twice would likely be faster than letting it fail, but it's hard to say without testing it. It all depends on how quickly the hash value for the object can be calculated and how many items in the dictionary share the same hash value.

As others have suggested, in the case of the Dictionary, the TryGetValue would provide the best of both methods. Other list types offer similar functionality.

Steven Doggart