1

I have been crashing my head trying to create a routine to identify a string in a TXT and copy that information to an excel sheet. This is the content in my test TXT file:

LIN+1++7501005111133:EN'
PIA+1+008112338:IN+.:VN'
PRI+AAB:760.73::EUP::EA'
PAC+1+3'
LIN+2++7501024201969:EN'
PIA+1+008126016:IN+.:VN'
PRI+AAB:732.07::EUP::EA'
PAC+1+3'
LIN+3++7501024201976:EN'
PIA+1+008126023:IN+.:VN'
PRI+AAB:710.86::EUP::EA'
PAC+1+3'
LIN+4++7501005114103:EN'
PIA+1+008126289:IN+.:VN'
PRI+AAB:737.91::EUP::EA'
PAC+1+3'
LIN+5++7501005113960:EN'
PIA+1+008126310:IN+.:VN'
PRI+AAB:737.91::EUP::EA'
PAC+1+3'

What I need to extract is for example all the lines starting with "PIA+1". In that case I should have in excel a column with this result:

PIA+1+008112338:IN+.:VN'
PIA+1+008126016:IN+.:VN'
PIA+1+008126023:IN+.:VN'
PIA+1+008126289:IN+.:VN'
PIA+1+008126310:IN+.:VN'

The thing is that I would like to have a process that I can reuse for other segments in the file, for example "LIN+" or others. I have created this code, but it's only retrieving me the first match:

Sub Extract_EDI_Data_2()

Dim wrk As Workbook
Dim shtSource As Worksheet
Dim shtResult As Worksheet
Dim rng As Range
Dim fndSection As Range
Dim fndValue As Range
Dim data As QueryTable
Dim strFile
Dim strPath As String
Dim strExt As String
Dim strSection As String
Dim strValue As String

    ThisWorkbook.Sheets("EDI_Data").Range("A2:AI100000").ClearContents

    ' ======== BEGIN SETTINGS ========
    strPath = "C:\Edicom\Input\"
    strExt = "*.EDI"

    strSection = "LIN+1++"
    strValue = "LIN+1++"
    ' ======== END SETTINGS ========

    Set wrk = Application.ThisWorkbook
    With wrk
        Set shtResult = ThisWorkbook.Worksheets("EDI_Data_Item")
        Set shtSource = .Worksheets.Add
    End With

    With shtResult
        .Cells(1, 2).Value = strValue
        .Name = "EDI_Data_Item"
    End With

    strFile = Dir(strPath & strExt, vbNormal)

    Do Until strFile = ""
        Set data = shtSource.QueryTables.Add(Connection:="TEXT;" & strPath & strFile, Destination:=shtSource.Cells(1, 2))
        With data
            .TextFileStartRow = 1
            .TextFileParseType = xlDelimited
            .TextFileConsecutiveDelimiter = True
            .TextFileOtherDelimiter = True
            .TextFileOtherDelimiter = "'"
            .Refresh BackgroundQuery:=True
        End With

        Set fndSection = data.ResultRange.Find(strSection)
        If Not fndSection Is Nothing Then
            Set fndValue = data.ResultRange.Find(strValue, fndSection)
            If Not fndValue Is Nothing Then
                shtResult.Cells(shtResult.Rows.Count, 2).End(xlUp).Offset(1).Value = Replace(fndValue, strValue, "")
            End If
        End If

        With data
            .ResultRange.Delete
            .Delete
        End With

        strFile = Dir
    Loop

    Application.DisplayAlerts = False
    shtSource.Delete
    Application.DisplayAlerts = True

End Sub

Any ideas to solve this puzzle?

Thanks for the support.

Regards

Rodrigo A.
  • 63
  • 8
  • 3
    Sounds like a job for [regular expressions](https://stackoverflow.com/q/22542834/4088852). – Comintern Jul 30 '18 at 16:20
  • 4
    The simplest thing to do is to open it in memory, split it to an array of lines & loop looking for lines that begin with PIA+1, is there a reason for doing this with a Query Table instead? – Alex K. Jul 30 '18 at 16:22
  • Are you missing a `Do While` loop entry? I'm getting the error `Loop without Do` Also, you could just copy the whole file to Excel, sort the names, then just have the macro look for where the first three/four letters differ and separate that way? – BruceWayne Jul 30 '18 at 16:22
  • @BruceWayne sorry, I have updated my code and shared it again. – Rodrigo A. Jul 30 '18 at 16:27
  • @AlexK. Alex thanks for your comment, this is the only way i found to create the code, I'm not a VBA but I tried this way! – Rodrigo A. Jul 30 '18 at 16:28

1 Answers1

1

Try replacing...

Set fndSection = data.ResultRange.Find(strSection)
If Not fndSection Is Nothing Then
    Set fndValue = data.ResultRange.Find(strValue, fndSection)
    If Not fndValue Is Nothing Then
        shtResult.Cells(shtResult.Rows.Count, 2).End(xlUp).Offset(1).Value = Replace(fndValue, strValue, "")
    End If
End If

with

Set fndValue = data.ResultRange.Find(strValue)
If Not fndValue Is Nothing Then
    strFirstAddress = fndValue.Address
    Do
        shtResult.Cells(shtResult.Rows.Count, 2).End(xlUp).Offset(1).Value = Replace(fndValue, strValue, "")
        Set fndValue = data.ResultRange.FindNext(fndValue)
    Loop While fndValue.Address <> strFirstAddress
End If

Actually, your code can be re-written as follows...

Option Explicit

Sub Extract_EDI_Data_2()

Dim shtSource As Worksheet
Dim shtResult As Worksheet
Dim fndValue As Range
Dim data As QueryTable
Dim strFile
Dim strPath As String
Dim strExt As String
Dim strValue As String
Dim strFirstAddress As String

    Application.ScreenUpdating = False

    ThisWorkbook.Sheets("EDI_Data_Item").Range("A2:AI100000").ClearContents

    ' ======== BEGIN SETTINGS ========
    strPath = "C:\Edicom\Input\"
    strExt = "*.EDI"

    strValue = "PIA+1"
    ' ======== END SETTINGS ========

    With ThisWorkbook
        Set shtResult = .Worksheets("EDI_Data_Item")
        Set shtSource = .Worksheets.Add
    End With

    With shtResult
        .Cells(1, 2).Value = strValue
        .Name = "EDI_Data_Item"
    End With

    strFile = Dir(strPath & strExt, vbNormal)

    Do Until strFile = ""
        Set data = shtSource.QueryTables.Add(Connection:="TEXT;" & strPath & strFile, Destination:=shtSource.Cells(1, 2))
        With data
            .TextFileStartRow = 1
            .TextFileParseType = xlDelimited
            .TextFileConsecutiveDelimiter = True
            .TextFileOtherDelimiter = True
            .TextFileOtherDelimiter = "'"
            .Refresh BackgroundQuery:=True
        End With

        Set fndValue = data.ResultRange.Find(strValue)
        If Not fndValue Is Nothing Then
            strFirstAddress = fndValue.Address
            Do
                shtResult.Cells(shtResult.Rows.Count, 2).End(xlUp).Offset(1).Value = Replace(fndValue, strValue, "")
                Set fndValue = data.ResultRange.FindNext(fndValue)
            Loop While fndValue.Address <> strFirstAddress
        End If

        With data
            .ResultRange.Delete
            .Delete
        End With

        strFile = Dir
    Loop

    Application.DisplayAlerts = False
    shtSource.Delete
    Application.DisplayAlerts = True

    Application.ScreenUpdating = True

End Sub

You'll notice that the Option Explicit statement is included at the top of the module. This forces the explict declaration of variables, and can help catch potential errors. Also, ScreenUpdating is turned off at the beginning of the code, and turned back on at the end. This should make the code a bit more efficient. Also, I assumed that you meant to clear the contents for the sheet called EDI_Data_Item, not EDI_Data.

Domenic
  • 7,844
  • 2
  • 9
  • 17