15

My project requires a bunch of dynamically-resizable arrays for different objects. An array may hold any number of objects, potentially thousands, of a single class, but not objects of multiple classes.

Mostly I will be iterating through arrays, thus use of a keyed collection is not ideal. I think I have two options:

The first option is to develop a 'List' class for each object type, with methods for adding objects (and expanding the array), getting the First and Last indexes and the object count, and retrieving an object by index (the latter 4 will include error handling in case the array is empty).

The second option is to develop a single 'List' class, with the same methods, using the Variant data type. Obviously this is a lot less work, but I am concerned about speed. How much slower is it to use variants than typed objects? Note that I will always be casting the variant objects in the array directly to a typed variable upon retrieval, a la:

Dim myObject As MyClass
Set myObject = variantList.Get(i)

Does casting improve speed, or does vba still have to perform all the type-checking associated with variants?

Also, would this second option be faster than using a non-keyed Collection? I have read that Collection iteration is slow, that they are designed for lookup. Does this apply to non-keyed collections, or only key-value-mapped collections?

Thanks to anyone who can offer advice.

Swiftslide
  • 1,307
  • 7
  • 23
  • 34
  • how often are you resizing them? VBA allows for dynamic array resizing via `redim preserve` if this does not happen often and/or you know the size of the new array, and would let you keep using an array (rather than a collection). – enderland Aug 31 '12 at 23:43
  • 1
    Whether or not any differences are significant would likely depend on your exact use case. Seems like it would be very easy to test though: have you run any comparisons? – Tim Williams Sep 01 '12 at 00:10

1 Answers1

23

I followed Tim Williams's advice and did some speed tests.

For each type of collection/array, I first added 100,000 objects of class "SpeedTester", which was simply a shell object holding a long variable (with get/set properties). The value of the variable was the value of the loop index (between 1 and 100,000)

Then I did a second loop, which involved accessing each object in the collection/array and assigning the object's long property value to a new variable of type long. I performed 3 rounds per method, and averaged the times for the And and get loops.

The results are as follows:

Method                      Avg Add Time    Avg Get Time    Total Time
Collection Indexed             0.305          25.498         25.803
Collection Mapped              1.021           0.320          1.342
Collection Indexed For Each    0.334           0.033          0.367
Collection Mapped For Each     1.084           0.039          1.123
Dynamic Array Typed            0.303           0.039          0.342
Static Array Typed             0.251           0.016          0.266

The methods Collection Indexed and Collection Mapped involved holding the objects in a collection. The first were added with no key, the second was added with a key which was the object's long property converted into a string. These objects were then accessed in a for-loop using an index from 1 to c.Count

The next two methods were identical to the first two in the way variables were added to the collection. However, for the Get loop, instead of using a for-loop with an index, I used a for-each loop.

Dynamic array typed was a custom class containing an array of type SpeedTester. Each time a variable is added, the size of the array was expanded by 1 slot (using ReDim Preserve). The get-loop was a for-loop using an index from 1 to 100,000, as is typical for an array.

Finally the static array typed was simply an array of type SpeedTester, which was initialised with 100,000 slots. Obviously this is the fastest method. Strangely enough, much of its speed gains were in Getting rather than Adding. I would have assumed that adding would be slower for the other methods, due to the need for resizing, while Getting each object would be no faster than a dynamic array.

I was astounded by the difference between using a for-loop and a for-each loop to access an indexed collection's objects. I was also suprised by the mapped collection's key lookup speed - much, much faster than indexing and comparable to all other methods except the static array.

In short, they're all viable alternatives for my project (except for the 1st and last methods, first because of its slowness, last because I need dynamically resizable arrays). I know absolutely nothing about how the collections are actually implemented, or the implementation differences between a dynamic and static array. Any further insight would be much appreciated.

EDIT: The code for the test itself (using the dynamic array)

Public Sub TestSpeed()
    Dim ts As Double
    ts = Timer()

    Dim c As TesterList
    Set c = New TesterList

    Dim aTester As SpeedTester

    Dim i As Long
    For i = 1 To 100000
        Set aTester = New SpeedTester
        aTester.Number = i

        Call c.Add(aTester)
    Next i

    Dim taa As Double
    taa = Timer()

    For i = c.FirstIndex To c.LastIndex
        Set aTester = c.Item(i)

        Dim n As Long
        n = aTester.Number
    Next i

    Dim tag As Double
    tag = Timer()

    MsgBox "Time to add: " & (taa - ts) & vbNewLine & "Time to get: " & (tag - taa)
End Sub

And for the dynamic array class TesterList:

Private fTesters() As SpeedTester

Public Property Get FirstIndex() As Long
    On Error GoTo Leave

    FirstIndex = LBound(fTesters)

Leave:
    On Error GoTo 0
End Property

Public Property Get LastIndex() As Long
    On Error GoTo Leave

    LastIndex = UBound(fTesters)

Leave:
    On Error GoTo 0
End Property

Public Sub Add(pTester As SpeedTester)
    On Error Resume Next

    ReDim Preserve fTesters(1 To UBound(fTesters) + 1) As SpeedTester
    If Err.Number <> 0 Then
        ReDim fTesters(1 To 1) As SpeedTester
    End If

    Set fTesters(UBound(fTesters)) = pTester

    On Error GoTo 0
End Sub

Public Function Item(i As Long) As SpeedTester
    On Error GoTo Leave

    Set Item = fTesters(i)

Leave:
    On Error GoTo 0
End Function

And finally, the very simple SpeedTester object class:

Private fNumber As Long

Public Property Get Number() As Long
    Number = fNumber
End Property

Public Property Let Number(pNumber As Long)
    fNumber = pNumber
End Property
Swiftslide
  • 1,307
  • 7
  • 23
  • 34
  • 1
    Can you post this code? I am really surprised how little loss there is on a dynamic array resizing each iteration vs a static sized array! I might try this with an object type which is considerably larger than just single `long` datatypes. – enderland Sep 01 '12 at 19:35
  • 1
    +1 for taking the time to do a thorough test of the different approaches. Seems like all of those methods (maybe with the exception of the indexed collection) should be ok for your project, since I got the impression you'd be working with low thousands of objects? – Tim Williams Sep 02 '12 at 02:17
  • Tim, yes my project only needs low-thousands of objects, however I thought it best to be thorough. – Swiftslide Sep 02 '12 at 05:23
  • 1
    Though this is an old thread, I recently had a low speed problem when using nested for loop "collection" (both outer and inner for loop are collection) which has both 100,000 items. The above structure takes me 1 minute to loop through a 100,000-items collection then finding the inner collection items by key. Later I change the the structure into a nested loop with outer loop as array and keep inner loop the same as the original mapped collection. It surprise me that now it only takes 1 second. Does anyone know why? The table in this thread implies that there should be no diff @@. – cindy50633 Jun 13 '22 at 02:44