Does it make sense that a VB6 program that processes ~50,000 XML files runs about 3x faster (in terms of files processed per second) when the files are each about 30KB in size than when they are each about 4KB? If it does make sense, how can I speed up processing of the smaller files?
The program reads each file, computes an MD5 hash value for the file and calls a SQL Server stored procedure to see if a version of the file having the same hash value is already stored in a database. If the file's computed hash value is already stored in the database the file isn't processed further; the program just repeats with the next file.
I've been testing on batches of ~50,000 XML files that have already been processed into the database, so the program just loops: get the next XML file → hash it → call the stored procedure → repeat.
I expected the program to run faster on similar-sized batches of smaller files, but it's significantly slower.
The program runs on a 64-bit Windows 10 workstation. The files are stored in a single directory on the workstation's C:\ drive (which is an SSD). SQL Server runs on a VM under Windows Server.
EDIT:
I think I have found the bottleneck, but I don't know how to solve it. The relevant piece of code is below. The bottleneck is caused by the xDOC.async = False statement. If I remove it I get an immediate 20x speed improvement, BUT removing it causes document load failure errors, since the code apparently can't handle asynchronous file loading. Can this be sped up?
Dim objFSO As Object
Dim objFolder As Object
Dim objFile As Object
Dim xDOC As MSXML2.DOMDocument
Dim xPE As MSXML2.IXMLDOMParseError

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFSO.GetFolder("C:\These are my XML Files")
For Each objFile In objFolder.Files
    Set xDOC = New MSXML2.DOMDocument
    xDOC.async = False   ' THIS LINE IS THE PROBLEM
    If xDOC.Load(objFile.Path) Then
        ' process the file
    Else
        Set xPE = xDOC.parseError
        With xPE
            ' set up "objFile.Name failed to load" error message
        End With
        ' log error details
        Set xPE = Nothing
    End If
    Set xDOC = Nothing
Next objFile
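
To check how much of the run time is really spent in the load step, it may help to time that step on its own, with hashing and the stored procedure call left out. The following is only a minimal sketch under stated assumptions: the sub name TimeLoadsOnly and its counters are hypothetical and not part of my program, and it assumes the same folder and the same MSXML project reference as the code above.

```vb
' Hypothetical helper (not part of the original program):
' loads every file in the folder synchronously and reports the elapsed time.
' Assumes a project reference to Microsoft XML (MSXML2), as in the code above.
Public Sub TimeLoadsOnly()
    Dim objFSO As Object
    Dim objFolder As Object
    Dim objFile As Object
    Dim xDOC As MSXML2.DOMDocument
    Dim startTime As Single
    Dim fileCount As Long
    Dim failCount As Long

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFolder = objFSO.GetFolder("C:\These are my XML Files")

    ' Timer returns seconds since midnight; coarse, but adequate for a whole batch.
    startTime = Timer
    For Each objFile In objFolder.Files
        Set xDOC = New MSXML2.DOMDocument
        xDOC.async = False          ' Load does not return until parsing has finished
        If Not xDOC.Load(objFile.Path) Then failCount = failCount + 1
        fileCount = fileCount + 1
        Set xDOC = Nothing
    Next objFile

    ' Elapsed time covers folder enumeration plus the loads themselves.
    Debug.Print fileCount & " files, " & failCount & " failures, " & _
                Format$(Timer - startTime, "0.00") & " s in the load loop"
End Sub
```

Running this once on the 4KB batch and once on the 30KB batch should show whether the per-file load cost alone accounts for the difference. Because Timer resets at midnight, a run that spans midnight would report a negative elapsed time.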