
I have a procedure which can run for a very long time; yesterday it took 14 hours to complete. The code loops over the values of a column that holds the filenames of images, and for each filename it searches an array holding the full paths of all files under a location the user selected. In this particular case the filename column contained nearly 2,600 filenames and the array to search held more than 12,000 records (that's over 31 million iterations, so any suggestions for improving this are welcome ;-)).

In this procedure I use DoEvents to keep Excel responsive, but I wondered whether it makes sense to have two DoEvents calls, one in each loop (see code below). All the processing is done in this piece of code, which in this case ran for more than 14 hours.

    For Each cell In ActiveSheet.Range("A1:A" & Range("A1").End(xlDown).row)
        DoEvents
        fileCopied = False
        fileName = cell.Value

        If Not (IsStringEmpty(fileName)) Then
            DoEvents
            For i = LBound(imgArray) To UBound(imgArray)
                If Not (IsStringEmpty(CStr(imgArray(i)))) Then
                    If ExactMatch Then
                        If (fsoGetFileName(imgArray(i)) = fileName) Then
                            If DoesFileExist(moveToPath & GetFileName(imgArray(i))) And Not OverwriteExistingFile Then
                                FileCopy imgArray(i), moveToPath & GetFileName(imgArray(i)) & "-" & Format(Now, "yyyymmddhhmmss")
                            Else
                                FileCopy imgArray(i), moveToPath & GetFileName(imgArray(i))
                            End If
                            fileCopied = True

                            If fileCopied Then
                                Range("B" & cell.row).Value = imgArray(i)
                            End If
                        End If
                    End If
                End If
            Next i
        End If
    Next

As you can see, I added two DoEvents calls. But if only one is enough, where would be the best place to add it: in the main loop or in the nested loop?

UPDATE:

Rereading the articles DoEvents and DoEvents (automateexcel) made it clear that multiple DoEvents calls should not be used. DoEvents is still necessary in this case because of the long-running procedure, but I no longer call it on every iteration. As suggested, I use:

If i Mod 100 = 0 Then DoEvents

UPDATE:

Thanks to FreeFlow I was able to gain significant performance improvements by using the built-in Filter function instead of looping over the array of more than 12,000 records. Using the Filter function sped the process up from hours to seconds.

UPDATE:

The end result is:

fileNameString = GetFilesUsingCMD(filePath)

If Not (IsStringEmpty(fileNameString)) Then
    Dim imgArray As Variant: imgArray = Split(fileNameString, "|")
    rowCount = ActiveSheet.Range("A:A").Cells.SpecialCells(xlCellTypeConstants).Count

    fileNameArray = Application.Transpose(ActiveSheet.Range("A:A"))
    activeRow = 0

    For fn = LBound(fileNameArray) To UBound(fileNameArray)
        fileName = fileNameArray(fn)

        If Not (IsStringEmpty(fileName)) Then
            If fn Mod 10 = 0 Then
                Progress.Update fn, rowCount, "(Nr. of files:" & CStr(UBound(imgArray)) & ") Executing time: " & CStr(Format((Timer - StartTime) / 86400, "hh:mm:ss")), fileName, True
                DoEvents
            End If

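            ' Filter returns every element of imgArray that contains fileName;
            ' the non-exact match also ignores case (vbTextCompare).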
            If Not ExactMatch Then
                resultArray = Filter(imgArray, fileName, True, vbTextCompare)
            Else
                resultArray = Filter(imgArray, fileName)
            End If

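            ' Filter returns an empty (zero-length) array when nothing matches, so UBound is -1.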
            If (UBound(resultArray) > -1) Then

                For i = LBound(resultArray) To UBound(resultArray)

                    If Not OverwriteExistingFile Then
                        If i = 0 Then
                            newFileName = GetFileName(resultArray(i))
                        Else
                            newFileName = CreateFileName(GetFileName(resultArray(i)), CStr(i))
                        End If
                    Else
                        newFileName = GetFileName(resultArray(i))
                    End If
                    FileCopy resultArray(i), moveToPath & newFileName

                    If Not OrgLocationAsLink Then
                        ActiveSheet.Cells(fn, i + 2).Value = resultArray(i) & " (" & newFileName & ")"
                    Else
                        ActiveSheet.Hyperlinks.Add ActiveSheet.Cells(fn, i + 2), Address:=resultArray(i)
                    End If

                Next i

            Else
                ActiveSheet.Range("B" & fn).Value = "** NOT Available **"
                ActiveSheet.Range("B" & fn).Font.Color = RGB(250, 0, 0)
            End If
        End If
    Next fn
End If

As said, because of the Filter function (Filter Function) I could get rid of the nested loop, which iterated more than 12,000 times for each row on the sheet.
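
GetFilesUsingCMD is a helper that is not shown above; based on the cmd /c dir /b /s approach described in the comments below, a minimal sketch of how it could look is shown here (the pipe delimiter matches the Split call above, but the exact implementation is a reconstruction, not the actual helper):

    ' Hypothetical sketch: return every file path under rootPath as a single
    ' pipe-delimited string, using a recursive "dir /b /s" via WScript.Shell.
    Function GetFilesUsingCMD(ByVal rootPath As String) As String
        Dim wsh As Object
        Dim output As String

        Set wsh = CreateObject("WScript.Shell")
        output = wsh.Exec("cmd /c dir """ & rootPath & """ /b /s").StdOut.ReadAll

        ' Replace line breaks with the "|" delimiter expected by Split above.
        output = Replace(output, vbCrLf, "|")

        ' Drop a trailing delimiter left by the final line break, if any.
        If Right$(output, 1) = "|" Then output = Left$(output, Len(output) - 1)

        GetFilesUsingCMD = output
    End Function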

Stephan
  • What is it doing that takes 14 hours? I think it'd make sense to try to get the processing time down. – Enigmativity Nov 02 '19 at 09:11
  • If you are copying files, that is probably what is taking a long time. `DoEvents` will make Excel responsive, but it also slows down the whole process, so if you are trying to speed it up, don't use any `DoEvents`. – braX Nov 02 '19 at 09:27
  • agree with @braX: you do not need DoEvents here. – Siddharth Rout Nov 02 '19 at 16:24
  • Also if there is only one match then you may want to exit the loop rather than looping through the whole array? An `Exit For` inside `If ExactMatch Then`? – Siddharth Rout Nov 02 '19 at 16:25
  • @SiddharthRout There could be multiple matches, so I couldn't use the Exit For. And I certainly need DoEvents because there is a progress bar which updates the status bar on a regular basis. But the changes I made improved the performance. – Stephan Nov 03 '19 at 07:19

2 Answers


One or more DoEvents calls will not solve the basic problem. There are a number of optimisations you can make which will speed things up immensely.

  1. Copy Excel ranges to VBA arrays (or another collection object) so that you don't make multiple accesses to Excel.

  2. Get directory listings from your target destinations, convert the text to an array or collection object and work with that rather than making multiple disk accesses to get individual filenames.

  3. Use ArrayLists and Scripting.Dictionaries (collection objects) so you can use their Contains or Exists methods to avoid doing specific If/Then comparisons (see the sketch after this list).

  4. Don't do individual disk copies. Create a list of copy/move instructions that can be run as a shell script when you have processed all your data.
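
As an illustration of points 1 and 3, here is a minimal sketch that reuses names from the question (imgArray, the GetFileName helper and ActiveSheet); everything else is illustrative, not the poster's actual code. Note that keying a Scripting.Dictionary on the bare file name keeps only one path per name, so this fits the ExactMatch case rather than multi-match searches.

    ' Build a lookup of bare file name -> full path once, so each worksheet
    ' value becomes a single Exists call instead of a scan over 12000 entries.
    Dim nameLookup As Object, i As Long
    Set nameLookup = CreateObject("Scripting.Dictionary")
    nameLookup.CompareMode = vbTextCompare

    For i = LBound(imgArray) To UBound(imgArray)
        nameLookup(GetFileName(CStr(imgArray(i)))) = imgArray(i)
    Next i

    ' Read the whole column into a VBA array in one operation (point 1),
    ' instead of touching each cell individually.
    Dim colValues As Variant
    colValues = ActiveSheet.Range("A1", ActiveSheet.Range("A1").End(xlDown)).Value

    For i = LBound(colValues, 1) To UBound(colValues, 1)
        If nameLookup.Exists(CStr(colValues(i, 1))) Then
            ' The matching full path is nameLookup(CStr(colValues(i, 1))).
        End If
    Next i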

freeflow
  • thanks. 1. I will try that. 2. I'm already doing that: the whole directory and its subdirectories are read using wsh.exec("cmd /c dir """ & rootPath & """ /b /s").StdOut.ReadAll (this is super fast), and the result is converted to an array. 3. and 4. I will look into those as well. Useful info. – Stephan Nov 02 '19 at 11:09
  • ++ Good suggestions. I will add one more: avoid multiple uses of `GetFileName(imgArray(i))`. It should be called only once, right after `If ExactMatch Then`, and stored in a variable. Constantly calling the function `GetFileName` will slow the code down. – Siddharth Rout Nov 02 '19 at 16:23
  • @SiddharthRout Good one too, but because I rewrote the code I could get rid of the GetFileName part. – Stephan Nov 03 '19 at 07:09

I would remove the DoEvents in the main loop and keep the one in the nested loop.

By the way, I would add Application.ScreenUpdating = False at the beginning of the Sub.
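
For example, a minimal sketch of that pattern (the Sub name and body are placeholders, not the question's actual procedure); restoring the setting in a clean-up block makes sure Excel does not stay frozen if an error occurs:

    Sub CopyMatchingFiles()
        Application.ScreenUpdating = False
        On Error GoTo CleanUp

        ' ... the long-running loop from the question goes here ...

    CleanUp:
        Application.ScreenUpdating = True
    End Sub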

The post below could be helpful.

https://wellsr.com/vba/2018/excel/vba-doevents-and-when-to-use-it/

PaichengWu