Which of the following loops is faster? I've read a lot on the net, including many posts here on Stack Overflow, and I'm still not sure what the answer is for .NET code. Does the .NET bytecode compiler apply some automatic optimization? I found a similar post, "Efficiency of nested Loop", but it is about Java.

For n = 1 To 1000
    For m = 1 To 2000
        A(n, m) = b(n, m)
    Next m
Next n

Or, switching the order around:

For m = 1 To 2000
    For n = 1 To 1000
        A(n, m) = b(n, m)
    Next n
Next m

Is one faster because of the order in which it walks through memory? If so, which?

GetFuzzy
  • Well, it might make a difference, I suppose, but we'd have to see the definition of `A(int, int)`. Practically speaking, you shouldn't worry about it. – Michael Petrotta Sep 06 '11 at 03:04
  • Because CLI arrays are row-major and spatial locality matters a lot on modern processors, the first snippet will be much faster. – Rick Sladkey Sep 06 '11 at 03:07

2 Answers

I had my TickTimer class around so I decided to give it a try. I had to increase the size of the arrays to notice a difference.
Test it for yourselves. The first one is indeed faster.

Module Module1

    Sub Main()
        Dim A(10000, 20000) As Int16
        Dim b(10000, 20000) As Int16

        For n = 1 To 10000
            For m = 1 To 20000
                A(n, m) = 1
                b(n, m) = 1
            Next m
        Next n

        Dim firstTick As TickTimer = New TickTimer()
        For n = 1 To 10000
            For m = 1 To 20000
                A(n, m) = b(n, m)
            Next m
        Next n
        Console.WriteLine(firstTick.DeltaSeconds(""))

        Dim secondTick As TickTimer = New TickTimer()
        For m = 1 To 20000
            For n = 1 To 10000
                A(n, m) = b(n, m)
            Next n
        Next m
        Console.WriteLine(secondTick.DeltaSeconds(""))

        Console.ReadKey()

    End Sub

End Module


Public Class TickTimer
    Public currentTicks As Long
    Public lastTicks As Long = System.DateTime.Now.Ticks
    Public retVal As String
    ''' <summary>
    ''' Calculates the seconds it took since the class was instantiated until this method
    ''' is first invoked and for subsequent calls since the previous time the method was called
    ''' </summary>
    ''' <param name="message">Message (e.g. "The last query took ")</param>
    ''' <returns>The passed string followed by the seconds: "          The last query took,     0.3456"</returns>
    ''' <remarks>To see how long a method takes to execute, instantiate this class at its
    ''' very beginning and call this method just before it returns. Log the result with Debug.WriteLine or something similar.</remarks>
    Public Function DeltaSeconds(ByVal message As String) As String
        currentTicks = System.DateTime.Now.Ticks
        retVal = String.Format("{0}, {1}", message.PadLeft(100), ((currentTicks - lastTicks) / TimeSpan.TicksPerSecond).ToString().PadRight(15))
        lastTicks = currentTicks
        Return retVal
    End Function
End Class
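
A side note on the timing itself: DateTime.Now typically has a coarser resolution than System.Diagnostics.Stopwatch, so for shorter runs a Stopwatch is probably the safer choice. Below is a minimal sketch of the same idea, not a drop-in replacement for TickTimer. The module name StopwatchSketch and the helper TimeSeconds are made up for illustration, and the array size is arbitrary.

```vbnet
Imports System
Imports System.Diagnostics

Module StopwatchSketch

    ' Hypothetical helper (not part of the answer above): runs the given
    ' work once and returns the elapsed wall-clock time in seconds.
    Function TimeSeconds(work As Action) As Double
        Dim timer As Stopwatch = Stopwatch.StartNew()
        work()
        timer.Stop()
        Return timer.Elapsed.TotalSeconds
    End Function

    Sub Main()
        ' Illustrative size only; increase it to see measurable times.
        Dim data(1000, 2000) As Short

        Dim seconds As Double = TimeSeconds(
            Sub()
                ' Last index in the inner loop, as in the first snippet.
                For n = 0 To 1000
                    For m = 0 To 2000
                        data(n, m) = 1
                    Next m
                Next n
            End Sub)

        Console.WriteLine("Elapsed seconds: " & seconds.ToString())
    End Sub

End Module
```

Passing each loop nest to the helper as a lambda keeps the two measurements symmetrical, though the first run may still pay one-off costs such as JIT compilation and first-touch page faults.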
Mircea Ion

Rick Sladkey hit the point exactly. The previous test scans the same arrays first "by rows" and then "by columns". Below is a corrected test, which performs the same operations on an array and on its transpose, so that both loop nests vary the last index in the inner loop. In this case the overhead of setting up the inner For loop does show up, and the second method (large loop inside small loop) is slightly quicker. That overhead goes unnoticed when the array is scanned "by columns", because that traversal takes an order of magnitude longer. I do not know the details of CLI arrays or why processors behave this way, but this is the evidence.

Structure mt
    Dim m() As Double
End Structure

Sub Main()

    Dim a, b As Integer
    Dim p As Integer = 10000000
    Dim q As Integer = 5
    Dim m1(p, q) As Double
    Dim m4(q, p) As Double
    'Dim m2()() As Double
    'Dim m3() As mt

    'ReDim m2(p)
    'For a = 1 To p
    '    ReDim m2(a)(q)
    'Next

    'ReDim m3(p)
    'For a = 1 To p
    '    ReDim m3(a).m(q)
    'Next

    Dim sw As New Stopwatch

    sw.Restart()
    For a = 1 To p
        For b = 1 To q
            m1(a, b) = 0
            'm2(a)(b) = 0
            'm3(a).m(b) = 0
        Next
    Next
    sw.Stop()

    Console.WriteLine("Small loop in large loop: " & sw.Elapsed.ToString())

    sw.Restart()
    For a = 1 To q
        For b = 1 To p
            'm1(b, a) = 0
            'm2(b)(a) = 0
            'm3(b).m(a) = 0
            m4(a, b) = 0
        Next
    Next
    sw.Stop()

    Console.WriteLine("Large loop in small loop: " & sw.Elapsed.ToString())
    Stop
End Sub
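
As for why the order matters, here is a rough sketch of the row-major layout Rick Sladkey mentions in his comment. In a row-major array the last index varies fastest in memory, so stepping the last index moves to the neighbouring element, while stepping the first index jumps over a whole row. The module name RowMajorSketch and the helper RowMajorOffset are hypothetical and only model the address arithmetic; they are not how the CLR exposes it. The column count assumes the question's A(1000, 2000), which in VB has indices 0..2000 in the second dimension.

```vbnet
Imports System

Module RowMajorSketch

    ' Hypothetical helper: linear element offset of (row, col) in a
    ' row-major two-dimensional array with the given number of columns.
    Function RowMajorOffset(row As Integer, col As Integer, columns As Integer) As Integer
        Return row * columns + col
    End Function

    Sub Main()
        ' Dim A(1000, 2000) has 2001 columns (indices 0 to 2000).
        Const columns As Integer = 2001

        ' Inner loop over the last index (m): one element per step.
        Dim strideInnerM As Integer =
            RowMajorOffset(1, 2, columns) - RowMajorOffset(1, 1, columns)

        ' Inner loop over the first index (n): a whole row per step.
        Dim strideInnerN As Integer =
            RowMajorOffset(2, 1, columns) - RowMajorOffset(1, 1, columns)

        Console.WriteLine("Stride with m innermost: " & strideInnerM.ToString())  ' prints 1
        Console.WriteLine("Stride with n innermost: " & strideInnerN.ToString())  ' prints 2001
    End Sub

End Module
```

With a stride of one element, consecutive accesses mostly land in a cache line that is already loaded. With a stride of a full row, nearly every access touches a different cache line, which is consistent with the timings reported here.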
Brian Webster
  • For the large (p=10000000) case a couple of days ago I must have had some other process(es) forcing paging in this one. The output was: Small loop in large loop: 00:02:27.1539801; Large loop in small loop: 01:12:11.9496944. So even if the smaller caches didn't cause an issue (which they do), if you have enough memory requested that some of it is paged out during every outer loop you'll definitely notice a big difference in performance. – Mark Hurd Oct 09 '12 at 08:58