0

Consider the scenario where you have a bunch of assets, and you have numbers associated to each type of assets. We are changing these numbers from set A to set B, so I am writing a script to populate some values for Set B in a new column in excel based on the original set A number. There are 110,000 items in each set.

Due to the information being scattered across a lot of sheets, I have adopted a VBA approach. My original code executed via a simple comparison of the strings:

Public Function SearchSAP(StkCd As Long) As Long
    Dim wb As Workbook
    Dim shSAP As Worksheet
    Dim i As Long

    ' SAP sheet name is fixed and does not change
    Set wb = ActiveWorkbook
    Set shSAP = wb.Worksheets("SAP")

    ' i is the start row of the SAP sheet for data
    i = 2

    ' Define no-match value as -1
    SearchSAP = -1        

    Do While i < shSAP.UsedRange.Rows.Count And i < 106212
        If shSAP.Cells(i, 1).value = Stkcd Then
            SearchSAP = shSAP.Cells(i, 2).value
            Exit Do
        End If

        i = i + 1
    Loop

    Set shSAP = Nothing
    Set wb = Nothing
End Function

This function took me forever to execute, probably closer to like 15-20 minutes on an i7 net 2.4 GHz core. I almost thought I have coded it wrong with an infinity loop. when it finally gave me "-1" did I realize that it really did take that long. Researching on stackoverflow, I found the post " How to optimize vlookup for high search count ? (alternatives to VLOOKUP) " which seems to indicate that dictionary is the way to go. So I tried that:

Public Function SearchSAP(StkCd As Long) As Long
    Dim wb As Workbook
    Dim shSAP As Worksheet
    Dim Dict As New Scripting.Dictionary
    Dim i As Long

    ' SAP sheet name is fixed and does not change
    Set wb = ActiveWorkbook
    Set shSAP = wb.Worksheets("SAP")

    ' i is the start row of the SAP sheet for data
    i = 2

    ' Define null value as -1
    SearchSAP = -1

    Do While i < shSAP.UsedRange.Rows.Count And i < 106212
        Dict.Add shSAP.Cells(i, 1).value, shSAP.Cells(i, 2).value
        i = i + 1
    Loop

    Do While i < shSAP.UsedRange.Rows.Count And i < 106212
        If Dict.Exists(StkCd) Then
            SearchSAP = Dict(StkCd)
            Exit Do
        End If

        i = i + 1

        If i = 150000 Then
            Debug.Print "Break"
        End If
    Loop

    Set shSAP = Nothing
    Set wb = Nothing
End Function

But this function still took a good 5 mins or so to figure itself out. My question then is, am I approaching this in a pretty dumb manner? How can I do this more efficiently? I am not a full time programmer, so I wasn't sure what I can do to optimize this. Any help would be great!

Community
  • 1
  • 1
Isa
  • 271
  • 1
  • 6
  • 16
  • 1
    Every time you call SearchSAP you're recreating the Dictionary: make that `Static` instead of just using Dim. – Tim Williams Jul 27 '16 at 19:17
  • @MacroMan Even though you may be correct that this question is off-topic, [please pick a bona fide reason for closing it on Stack Overflow](http://meta.stackoverflow.com/q/313266/1157100). "Belongs on _some other site_" is insufficient justification for closing a question. – 200_success Jul 27 '16 at 20:15
  • @MacroMan Even if the end result is the same, explaining your reasoning matters. Otherwise, the OP and other users have no idea _why_ you think it's off-topic, and your vote would appear to be purely based on your arbitrary opinion. – 200_success Jul 27 '16 at 20:44
  • Can the lookup ranges be sorted by column 1? this makes your search much quicker. You then use double vlookup technique, like pseudocode: `if(vlookup(stkCd, sortedTable, 1, true)=stkCd , then searchSap = vlookup(stkCd, sortedTable, 2, true)`. You can convert this to proper VBA, or a formula version. It checks that the stkCd is exactly the cell value, then if so looks up column 2. Using binary search(approximate match) speeds up the whole process. Maybe 16-20 steps to find an item using binary search(x2 because we are vlooking up twice), whereas exact match takes 20k if the item is on row 20K – MacroMarc Jul 28 '16 at 09:57
  • I did not know why Static would make a difference, but it did work. The times I have given above were only for one single execution of the function and there were no multiple calls in the function when I was testing this. Still don't really understand where the time savings came from. I am thinking it probably is more likely from using the .range approach instead of getting data cell by cell. Thanks all of you though! Appreciate it a lot! – Isa Jul 28 '16 at 16:34

1 Answers1

1
Public Function SearchSAP(StkCd As Long) As Long

    Static Dict As scripting.dictionary 'precerved between calls
    Dim i As Long, arr

    If Dict Is Nothing Then
        'create and populate dictionary
        Set Dict = New scripting.dictionary
        With ActiveWorkbook.Worksheets("SAP")
            arr = .Range(.Range("A2"), _
                         .Cells(.Rows.Count, "A").End(xlUp).Offset(0, 1)).Value
        End With
        For i = 1 To UBound(arr, 1)
            Dict.Add arr(i, 1), arr(i, 2)
        Next i
    End If

    If Dict.exists(cstr(StkCd)) Then
        SearchSAP = CLng(Dict(cstr(StkCd)))
    Else
        SearchSAP = -1
    End If

End Function
Isa
  • 271
  • 1
  • 6
  • 16
Tim Williams
  • 154,628
  • 8
  • 97
  • 125
  • 10 mins to <1 second. You guys are awesome! :> – Isa Jul 28 '16 at 16:29
  • Worth noting that if you edit the lookup range after the `Dict` variable has been initialized, you will need to reset the VBproject to clear the variable. – Tim Williams Jul 28 '16 at 17:17