1

I'm interested in using VBA/VBscript to parse email header data. There are other answers on this site that address this to some degree, however not to the extent that I need.

I have a number of message headers that I've extracted from emails and have saved as text files, all in one folder (see image below). I would like to loop through the folder and parse the area identified in the box (perhaps save to excel or a table in Access?). The data identified in the box shows all the email "hops" (when an email is sent it is transferred between many computers - each transfer is a "hop"). This data is found in the "Received: from" section highlighted below:

*NOTE: Apologies, I'm not at a reputation as of yet to post images:

https://msdnshared.blob.core.windows.net/media/TNBlogsFS/prod.evol.blogs.technet.com/CommunityServer.Blogs.Components.WeblogFiles/00/00/00/76/18/3782.HSG-8-18-11-1.jpg

The result should look like this:

https://msdnshared.blob.core.windows.net/media/TNBlogsFS/prod.evol.blogs.technet.com/CommunityServer.Blogs.Components.WeblogFiles/00/00/00/76/18/7624.hsg-8-19-11-1.png

This is accomplished by parsing the FROM, BY, WITH, and DATESTAMP information from the above boxed area of the message header.

Kind of a tall order, I know. But I can't seem to find anything online. Any assistance (or direction to other solutions) would be appreciated.

Thank you.

Craig
  • 145
  • 1
  • 12
  • What *exactly* are you having a problem with? Reading a text file from disk? Identifying the lines of interest? Splitting the lines into your required fields ? – Tim Williams May 10 '17 at 17:47
  • The latter. The lines of interest are embedded in the "Received: From" section of the header. If you look at this link, you'll see the code for accomplishing what I would like to do (however, it is done with Powershell, not VBA): https://gallery.technet.microsoft.com/office/8c15881d-c10f-4309-9900-4ff0653987a5 – Craig May 10 '17 at 18:14

1 Answers1

0

Try this code and let me know if it works for you or not. You just need to provide the path of the folder containing all your text files. The code would store the output in a file"1.txt" within the same folder. All the values in the file will be separated by double-pipe(||). Ofcourse, you can change it later as per your requirement.

Dim objFso, strFolderPath, objFolder, file, workFile, tempArr, strAllData
Dim strRecordData, arrRecordData
strFolderPath = "C:\Users\gu.singh\Desktop\Desktop\Gurman\2017\5. May\aa"   'REPLACE THIS PATH WITH YOUR FOLDER PATH
Set objFso = CreateObject("Scripting.FileSystemObject")
If objFso.FolderExists(strFolderPath) Then
    Set objFolder = objFso.GetFolder(strFolderPath)
    For Each file In objFolder.Files
        tempArr = Split(file.Name,".")
        If StrComp(tempArr(UBound(tempArr)),"txt",1)=0 Then
            Set workFile = file.OpenAsTextStream(1)
            strAllData = workFile.ReadAll()
            workFile.Close
            arrRecordData = Split(strAllData,"Received:")
            For i=1 To UBound(arrRecordData) Step 1
                intby = InStr(1,arrRecordData(i),"by ",1)
                intfrom = InStr(1,arrRecordData(i),"from ",1)
                intwith = InStr(1,arrRecordData(i),"with ",1)
                intsemi = InStr(1,arrRecordData(i),";",1)
                intdash = InStr(1,arrRecordData(i),"-",1)
                strFROM = Trim(Mid(arrRecordData(i),intfrom+Len("from "), intby-intfrom-Len("from ")))
                strBY = Trim(Mid(arrRecordData(i),intby+Len("by "), intwith-intby-Len("by ")))
                strWITH = Trim(Mid(arrRecordData(i),intwith+Len("with "), intsemi-intwith-Len("with ")))
                strDATE = Trim(Mid(arrRecordData(i),intsemi+Len(";"), intdash-intsemi-Len(";")))
                strResult = strResult & strBY &"||"&strFROM&"||"&strWITH&"||"&strDATE&vbCrLf
            Next

            Set WorkFile = Nothing
        End If
    Next
    Set objFolder = Nothing
End If
Set fyl = objFso.OpenTextFile(strFolderPath&"\1.txt",2,True)
fyl.Write strResult
fyl.Close
Set fyl=Nothing
Set objFso = Nothing
Gurmanjot Singh
  • 10,224
  • 2
  • 19
  • 43
  • Thanks, Gman. I'm still testing...I'm noticing a number of variations, however, in the header text of the emails I processed, so I may need to do some one-off scripting to account for these (i.e., a "from" tag may be followed by 1 or more "by" tags). – Craig May 12 '17 at 00:20
  • @Craig Ok. But even multiple "by" thing can be handled by a small tweak in the code. – Gurmanjot Singh May 12 '17 at 02:10
  • Thanks. The issue I have is that the above example (1st link that I provided) shows a header example that produces the 2nd link (there are actually 4 hops embedded in the paragraph). Does your code create a similar result for each email header passed to it? It's almost as if we need a "do loop" that loops through each paragraph string and carves out each hop. – Craig May 12 '17 at 09:00
  • @Craig To handle that I have used a split statement in the code. It should work if the number of times the word "Received: " is encountered in the text file = No. of hops you are referring to. – Gurmanjot Singh May 12 '17 at 09:08