I'm new to SO--I hope I'm posting this in the right place.

As an update to my post below, I've found one way of getting the City portion of the address. I don't really need the max value as suggested in my original post; I just need the last value because the "xyz:sequenceNumber" values are always, well, in sequence. So I tried this:

tmpCity = xNode.selectSingleNode("//abc:Person//xyz:Region").previousSibling.Text

and it seems to work, since I'm dealing with well-formatted .xml files in which the previousSibling of xyz:Region is always the last line of xyz:AddressText containing the City value that I'm looking for. I'd still appreciate any comments, because I remain in the dark as to whether this (and the code below) is even remotely efficient. I've got many large .xml files to shred, so efficiency matters.

I need to work with some obsolete VB6 code that includes a recursive XML shredding subroutine. I'm familiar with VB6, but I don't understand this subroutine. I've pasted a portion of the code below. I'm hoping someone can point me to some detailed background reading that will help me figure out how the subroutine's XML handling works, so I can maintain and modify it. I've also pasted below a [sanitized] sample extract from two of the XML files that I need to work with.

One problem is that the City portion of the address is stored in the last of a series of numbered xyz:AddressText elements. To get the City, I need to extract the text of the element with the maximum xyz:sequenceNumber attribute value. SO has many posts with examples of how to get the highest attribute value, but I can't get them to work in this subroutine. Typically they use either a max() function, which VB6 complains about when I try to use it in this subroutine, or something like the snippet below, but when I try to adapt that, VB6 complains about the double colon ("::").

doc.SelectSingleNode("//Employees/Employee/@Id[not(. <=../preceding-sibling::Employee/@Id) and not(. <=../following-sibling::Employee/@Id)]");

I'm guessing that the examples I've seen pertain to different libraries than whatever is available in VB6.

Here's the XML sample:

<abc:Person>
  <xyz:LicenseNo>1234</xyz:LicenseNo>
  <xyz:Language xyz:languageCode="en">
    <xyz:Name>
      <xyz:Company xyz:languageCode="en">ABC Company Ltd.</xyz:Company>
    </xyz:Name>
    <xyz:AddressCollection>
      <xyz:Address>
        <xyz:SequencedAddress xyz:languageCode="en">
          <xyz:AddressText xyz:sequenceNumber="1">The ABC Building</xyz:AddressText>
          <xyz:AddressText xyz:sequenceNumber="2">123 Main Street</xyz:AddressText>
          <xyz:AddressText xyz:sequenceNumber="3">3rd Floor</xyz:AddressText>
          <xyz:AddressText xyz:sequenceNumber="4">Tampa</xyz:AddressText>
          <xyz:Region xyz:RegionCategory=“State”>FL</xyz:Region>
          <xyz:CountryCode>US</xyz:CountryCode>
          <xyz:ZipCode>33607</xyz:ZipCode>
        </xyz:SequencedAddress>
      </xyz:Address>
    </xyz:AddressCollection>
  </xyz:Language>
</abc:Person>
<abc:Person>
  <xyz:LicenseNo>567</xyz:LicenseNo>
  <xyz:Language xyz:languageCode="en">
    <xyz:Name>
      <xyz:Company xyz:languageCode="en">XYZ Industries Ltd.</xyz:Company>
    </xyz:Name>
    <xyz:AddressCollection>
      <xyz:Address>
        <xyz:SequencedAddress xyz:languageCode="en">
          <xyz:AddressText xyz:sequenceNumber="1">XYZ Factory Plaza</xyz:AddressText>
          <xyz:AddressText xyz:sequenceNumber="2">678 Elm Street</xyz:AddressText>
          <xyz:AddressText xyz:sequenceNumber="3">Orlando</xyz:AddressText>
          <xyz:Region xyz:RegionCategory=“State”>FL</xyz:Region>
          <xyz:CountryCode>US</xyz:CountryCode>
          <xyz:ZipCode>32814</xyz:ZipCode>
        </xyz:SequencedAddress>
      </xyz:Address>
    </xyz:AddressCollection>
  </xyz:Language>
</abc:Person>

Here's the code portion:

Public Sub ShredXML(ByRef Nodes As MSXML2.IXMLDOMNodeList)
    Dim xNode As MSXML2.IXMLDOMNode

    For Each xNode In Nodes
        If xNode.nodeType = NODE_ELEMENT Then
            If xNode.nodeName = "abc:Person" Then
                tmpCompany = xNode.selectSingleNode("//abc:Person//xyz:Company").Text
                tmpLicenseNo = xNode.selectSingleNode("//abc:Person//xyz:LicenseNo").Text
                tmpLanguage = xNode.selectSingleNode("//abc:Person//xyz:Language").Attributes.getNamedItem("xyz:languageCode").Text
                tmpRegion = xNode.selectSingleNode("//abc:Person//xyz:Region").Text
                tmpCountryCode = xNode.selectSingleNode("//abc:Person//xyz:CountryCode").Text
                tmpZipCode = xNode.selectSingleNode("//abc:Person//xyz:ZipCode").Text

                ' database insert code omitted

            End If
        End If

        ' Recurse into the children of every node.
        If xNode.hasChildNodes Then
            ShredXML xNode.childNodes
        End If

    Next xNode
End Sub

BRW

1 Answer

I took your sample XML snippet, fixed the errors (curly quotation marks, “State” instead of "State"), and wrapped it in a root element that declares the two namespace prefixes:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<doc xmlns:abc="urn:abc" xmlns:xyz="urn:xyz">
:
</doc>

Saved that as sample.xml and ran this code:

Option Explicit

Private Sub WriteLine(Optional ByVal Text As String)
    With Text1
        .SelStart = &H7FFF 'End.
        If Len(Text) > 0 Then .SelText = Text
        .SelText = vbNewLine
    End With
End Sub

Private Sub Form_Load()
    Dim ParseSuccess As Boolean
    Dim Element1 As MSXML2.IXMLDOMElement
    Dim Element2 As MSXML2.IXMLDOMElement
    Dim LicenseNo As String
    Dim Company As String
    Dim Language As String
    Dim Region As String
    Dim CountryCode As String
    Dim ZipCode As String
    Dim MaxSeqVal As Long
    Dim MaxSeqElement As MSXML2.IXMLDOMElement
    Dim SeqVal As Long
    Dim City As String

    With New MSXML2.DOMDocument60
        .async = False 'Load synchronously so the whole document is parsed before we query it.
        ParseSuccess = .Load(App.Path & "\sample.xml")
        WriteLine "Parse success = " & CStr(ParseSuccess)
        If ParseSuccess Then
            WriteLine
            For Each Element1 In .getElementsByTagName("abc:Person")
                With Element1
                    LicenseNo = .getElementsByTagName("xyz:LicenseNo")(0).Text
                    Company = .getElementsByTagName("xyz:Company")(0).Text
                    Set Element2 = .getElementsByTagName("xyz:Language")(0)
                    Language = Element2.getAttribute("xyz:languageCode")
                    Region = .getElementsByTagName("xyz:Region")(0).Text
                    CountryCode = .getElementsByTagName("xyz:CountryCode")(0).Text
                    ZipCode = .getElementsByTagName("xyz:ZipCode")(0).Text
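                    'Find the AddressText with the highest sequenceNumber.
                    MaxSeqVal = -1
                    Set MaxSeqElement = Nothing 'Don't carry over the previous Person's element.
                    For Each Element2 In .getElementsByTagName("xyz:AddressText")
                        SeqVal = CLng(Element2.getAttribute("xyz:sequenceNumber"))
                        If SeqVal > MaxSeqVal Then
                            MaxSeqVal = SeqVal
                            Set MaxSeqElement = Element2
                        End If
                    Next
                    If MaxSeqElement Is Nothing Then
                        City = vbNullString 'No AddressText found for this Person.
                    Else
                        City = MaxSeqElement.Text
                    End If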
                End With
                WriteLine LicenseNo
                WriteLine Company
                WriteLine Language
                WriteLine Region
                WriteLine CountryCode
                WriteLine ZipCode
                WriteLine City
                WriteLine
            Next
        End If
    End With
End Sub

Note that the IXMLDOMNode and IXMLDOMElement interfaces have different members (properties and methods), which is why Element1 and Element2 are declared As IXMLDOMElement here. Assigning a node to such a variable makes VB6 query it for the more useful Element interface, which offers getAttribute and getElementsByTagName.

Seemed to work fine:

Parse success = True

1234
ABC Company Ltd.
en
FL
US
33607
Tampa

567
XYZ Industries Ltd.
en
FL
US
32814
Orlando

MSXML doesn't support XPath 2.0 as far as I know, only XPath 1.0, so there is no max() function available.
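
If you would rather stay with XPath, the maximum can still be expressed in XPath 1.0 by selecting the element for which no sibling has a larger value. The following is only a rough sketch, not something I have run against your real files. It assumes the placeholder namespace URIs urn:abc and urn:xyz from the wrapper document above (your files will declare their own), and GetCity and Demo are hypothetical names. The prefixes have to be registered through the SelectionNamespaces property before they can be used in a query. With DOMDocument60 the selection language is already XPath; with the older MSXML 3 DOMDocument you would also need setProperty "SelectionLanguage", "XPath".

```vb
Option Explicit

' Sketch: return the text of the xyz:AddressText with the highest
' xyz:sequenceNumber under one abc:Person, using XPath 1.0 only.
' GetCity is a hypothetical helper name, not part of the original code.
Private Function GetCity(ByVal Person As MSXML2.IXMLDOMNode) As String
    Dim CityNode As MSXML2.IXMLDOMNode

    ' The leading "." keeps the search inside this Person. A path that
    ' starts with "//" is evaluated from the document root instead.
    ' The predicate keeps the element for which no sibling AddressText
    ' has a larger sequenceNumber ("<" compares the values as numbers).
    Set CityNode = Person.selectSingleNode( _
        ".//xyz:AddressText[not(@xyz:sequenceNumber < " & _
        "../xyz:AddressText/@xyz:sequenceNumber)]")

    If Not CityNode Is Nothing Then GetCity = CityNode.Text
End Function

Private Sub Demo()
    Dim Doc As MSXML2.DOMDocument60
    Dim Person As MSXML2.IXMLDOMNode

    Set Doc = New MSXML2.DOMDocument60
    Doc.async = False

    ' Assumed URIs: replace them with the ones your files declare.
    Doc.setProperty "SelectionNamespaces", _
        "xmlns:abc='urn:abc' xmlns:xyz='urn:xyz'"

    If Doc.Load(App.Path & "\sample.xml") Then
        For Each Person In Doc.selectNodes("//abc:Person")
            Debug.Print GetCity(Person)
        Next
    End If
End Sub
```

For the sample document this should pick the same elements as the loop above. If the sequence numbers are guaranteed to be in document order, the simpler path ".//xyz:AddressText[last()]" ought to do the same job. Whether either is faster than the loop on large files is something you would have to measure.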

Bob77
  • If you *know* that `City` is always the last `xyz:AddressText` you could always just use the `IXMLDOMNodeList` returned by `getElementsByTagName()` and use its `.item(.length - 1).text` to get `City` values. – Bob77 Oct 31 '18 at 14:31
  • Thank you very much. This makes more sense to me than the "//" technique in the code that I posted. I'll try to adapt it into the code I'm working with and do some tests to see if it also improves speed. Thanks again. BRW – BRW Nov 03 '18 at 16:19