6

I have a huge DataTable, and I need go by each row and validate an specific value.

Which method give me more performance, an structure of IF ELSE or SELECT CASE? (I'm focused in the method that offer me the best performance)

IF ELSE (METHOD #1)

For Each vRow In vDTtemp.Rows
    If vRow("Item") = "Time" Then
        vRow("Result") = "000"
    ElseIf vRow("Item") = "DateTime" Then
        vRow("Result") = "001"
    ElseIf vRow("Item") = "String" Then
        vRow("Result") = "002"
    Else
        vRow("Result") = "N/A"
    End If
Next

SELECT CASE (METHOD #2)

For Each vRow In vDTtemp.Rows
    Select Case vRow("Item")
        Case "Time"
            vRow("Result") = "000"
        Case "DateTime"
            vRow("Result") = "001"
        Case "String"
            vRow("Result") = "002"
        Case Else
            vRow("Result") = "N/A"
    End Select
Next
MiBol
  • 1,985
  • 10
  • 37
  • 64
  • 1
    What results did you get from actually testing it? Also, (1) it could be faster to reference the columns by index instead of by name and (2) `Select Case` is subject to optimisations in a release version which aren't done in a debug version. – Andrew Morton Feb 28 '13 at 19:37
  • 4
    AFAIK It sthe same. In both cases it will break after finding a true condition. But Select Case is a much nicer and cleaner way when you have to write many If sentences. – Esselans Feb 28 '13 at 19:39
  • 2
    Yes performance should be similar, but Select Case is more readable. – Meta-Knight Feb 28 '13 at 19:44
  • @OwerFlov mmmm, I think that the IF Method #1 can consume more resource. By example, all data has the "String" value, then, the cycle needs consider the Time and the DateTime, will take longer. In the method #2 the select case go directly to the option. – MiBol Feb 28 '13 at 19:47
  • @Andrew Morton: Nice tip use the column index instead of the name. But I want know the better performance in a cycle situation. – MiBol Feb 28 '13 at 19:50
  • 1
    You can use StopWatch and see it for yourself (I don't have a lot of data here to test it myself) as shown in http://www.dreamincode.net/forums/topic/180080-ifelse-vs-select-case-comparison/page__view__findpost__p__1056722 – Esselans Feb 28 '13 at 19:53
  • If you care to maintain your code in the long run, you should not perform *validation* like that. – Victor Zakharov Feb 28 '13 at 20:55
  • Where did the datatable data come from? It could be more efficient to set vRow("Result") at the same time as you populate the datatable. – Andrew Morton Feb 28 '13 at 21:18
  • @AndrewMorton: I'm working with RFC (Remote Function Calls) from SAP. Basically this function retrieve me an a long array of objects. `Dim vSAPdata(,) As Object`. All the information that I manage is in the local memory to transform it into DataTable – MiBol Mar 01 '13 at 00:41
  • @Neolisk: I cannot avoid the cycle and sometimes the information that I need to manage it's huge. At this point an SQL Server doesn't help me much. – MiBol Mar 01 '13 at 00:45

4 Answers4

6

It makes no difference, both code styles generate the exact same IL. Something you can see by running the ildasm.exe tool on your compiled assembly.

In general, the VB.NET compiler does make an effort to optimize a Select statement. That will work when it uses a simple value type as the selector and trivial Case statements. The generated code will use a dedicated IL instruction, Opcodes.Switch. Which will be compiled to machine code that uses a lookup table. Very fast.

That however doesn't work when you use a string expression as the selector. Making a lookup table for that one would require the equivalent of a dictionary of delegates. That's too impactful, the compiler cannot do anything but convert each case statement to the equivalent of an If statement. You can however optimize it yourself easily by creating this Dictionary in your code, easy to do since the dictionary key and value are just simple strings. You don't have enough cases and the strings are too short to make this pay off majorly, although it is worth a try. It certainly can compact your code.

Hans Passant
  • 922,412
  • 146
  • 1,693
  • 2,536
  • I don't have a lot experience using Dictionaries, but worth searching more information about that. Thanks! – MiBol Mar 01 '13 at 01:24
5

I've spent quite a lot of time working on this same problem over the last couple of days and have found one approach which is much faster than the others. I too found that using Select Case on a string variable was equivalent to a series of If/Else If statements, and both were disappointingly slow.

However the following technique has worked very well, and reduced the amount of time by over 50%. Instead of the original code:

For Each vRow In vDTtemp.Rows
    Select Case vRow("Item")
        Case "Time"
            vRow("Result") = "000"
        Case "DateTime"
            vRow("Result") = "001"
        Case "String"
            vRow("Result") = "002"
        Case Else
            vRow("Result") = "N/A"
    End Select
Next

Change it around to switch on a simple Boolean, and use the String.Equals method, like this:

For Each vRow In vDTtemp.Rows
    'Read out the row value so we only need to access the datarow once
    rowValue = vRow("Item")
    'Which of these statements is true?
    Select Case True
        Case rowValue.Equals("Time")
            vRow("Result") = "000"
        Case rowValue.Equals("DateTime")
            vRow("Result") = "001"
        Case rowValue.Equals("String")
            vRow("Result") = "002"
        Case Else
            vRow("Result") = "N/A"
    End Select
Next

I've had significant improvements by approaching it in this way, in one case reducing my code from 1.3 seconds over a 100,000 iteration loop to 0.5 seconds. If this is in a really frequently-called time-critical section of code, that can make a big difference.

As pointed out in the comments below however, this performs an "Ordinal" comparison of strings, which may not result in the expected behaviour if non-English locales are being used (see the comments for examples).

Adam.

Adam Dawes
  • 188
  • 2
  • 10
  • 1
    Adam, thanks for the response. I will perform some test with this method. – MiBol Nov 20 '13 at 14:57
  • 4
    This is incompatible with the original code in a dangerous way, one you really need to warn about in your answer. The speedup is easily explained, String.Equals() uses StringComparison.Ordinal. The fastest possible way to compare strings, but with pretty troublesome semantics in various languages. A German user will not understand why ss doesn't match ß. The Turkish I is infamous. Etcetera. The normal way to get this out of an unhacky Select Case statement is `Option Compare Binary`. – Hans Passant Nov 20 '13 at 15:24
  • 1
    That's true and a good point. There are additional parameters that can be passed to the String.Equals method, but in my tests they caused the performance to drop back to the same as (or worse than) the original Select Case statement. In my particular case this is of no concern as I have complete control over the values, but anyone using other locales will definitely need to be aware of this. – Adam Dawes Nov 20 '13 at 16:12
  • @Hans Passant: A very nice point that you remark! I will take in consideration when implement this method. thnks! – MiBol Nov 20 '13 at 20:06
5

Ok... it was long time ago for this post, but now i was searching the same question and i can add new optimization for this. For now i've choosed to use the select case, for be more readable. In the other hand, the performance decreases a lot when a "Dim" is inside a for-next loop.

     For Each vRow In vDTtemp.Rows
------->  Dim rowItem = vRow("Item")
        If rowItem = "Time" Then
            vRow("Result") = "000"
        ElseIf rowItem = "DateTime" Then
            vRow("Result") = "001"
        ElseIf rowItem = "String" Then
            vRow("Result") = "002"
        Else
            vRow("Result") = "N/A"
        End If
    Next

It's a lot faster when dim is oitside even if you want to use the if-then structure:

------->     Dim rowItem as string
             For Each vRow In vDTtemp.Rows
------->            rowitem= vRow("Item")
                If rowItem = "Time" Then
                    vRow("Result") = "000"
                ElseIf rowItem = "DateTime" Then
                    vRow("Result") = "001"
                ElseIf rowItem = "String" Then
                    vRow("Result") = "002"
                Else
                    vRow("Result") = "N/A"
                End If
            Next

I hope this be helpful for someone more ;)

Ivan Font
  • 73
  • 1
  • 4
2

If you really find this is your bottleneck in performance, you could try modifying the If..Then clause as follows to only access the indexer once:

For Each vRow In vDTtemp.Rows
    Dim rowItem = vRow("Item")
    If rowItem = "Time" Then
        vRow("Result") = "000"
    ElseIf rowItem = "DateTime" Then
        vRow("Result") = "001"
    ElseIf rowItem = "String" Then
        vRow("Result") = "002"
    Else
        vRow("Result") = "N/A"
    End If
Next

That being said, I suspect each of these is a case of over optimizing. The compiler should to the best thing here. If you check the IL from a long Select Case, you may find that it uses a string of If..then clauses under the covers with "goto" to escape the rest of the clauses. Your best option here is to get the most maintainable code as the performance benefits you may gain will be offset by the minimal boost you may see between if..then and select case.

Jim Wooley
  • 10,169
  • 1
  • 25
  • 43