0

I have a completed .pdf form (XFDF, XML) that I am importing into excel. The goal is to get a "Field Name" and a "Value" for that field from the XML.

Currently I am able to get the "Value" (Blue accent in picture below) by using a simple command such as:

 list(4).text        

I am struggling on how to store the "Field Name". In the below picture it is the expression/value highlighted in yellow.

I have tried variations of the following but can't get it to work.

List(4).Attributes(1).value

Any tips on how to access the nested value?

List -> Item(4) -> Attributes -> Item(1) -> value

Code and XML List Image

XFDF RAW:

<?xml version="1.0" encoding="utf-16"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
  <fields>
    <field name="3C_AQS_1">
      <value>2</value>
    </field>
    <field name="3C_AQS_2">
      <value>3</value>
    </field>
    <field name="3C_AQS_3">
      <value>1</value>
    </field>
    <field name="3C_AYS_1">
      <value>D:23440101</value>
    </field>
    <field name="3C_AYS_2">
      <value>D:13240101</value>
    </field>
    <field name="3C_AYS_3">
      <value>D:12340101</value>
    </field>
    <field name="3C_EEL_1">
      <value>2</value>
    </field>
    <field name="3C_EEL_2">
      <value>5</value>
    </field>
    <field name="3C_EEL_3">
      <value>2</value>
    </field>
    <field name="3C_JobProfile_1">
      <value>1235</value>
    </field>
    <field name="3C_JobProfile_2">
      <value>547</value>
    </field>
    <field name="3C_JobProfile_3">
      <value>1234 fd</value>
    </field>
    <field name="4_PME_DBT">
      <value>yuktyu</value>
    </field>
    <field name="4_PME_Launch">
      <value>ew4tw</value>
    </field>
    <field name="4_PME_Scope">
      <value>serg</value>
    </field>
    <field name="4_PPT_End">
      <value>D:19930714</value>
    </field>
    <field name="4_PPT_Start">
      <value>D:19001201</value>
    </field>
    <field name="Capital_LifeAmount">
      <value>5352</value>
    </field>
    <field name="Capital_Yr">
      <value>1</value>
    </field>
    <field name="Capital_YrAmount">
      <value>123124</value>
    </field>
    <field name="Capital_YrAmountP">
      <value>234</value>
    </field>
    <field name="CostConfidence">
      <value>Med</value>
    </field>
    <field name="Devliverables_1">
      <value>Delev 1</value>
    </field>
    <field name="Devliverables_2">
      <value>Delev 2</value>
    </field>
    <field name="Devliverables_3">
      <value>Delev 3</value>
    </field>
    <field name="Devliverables_4">
      <value>Delev 4</value>
    </field>
    <field name="Devliverables_5">
      <value>Delev 5</value>
    </field>
    <field name="HR Initiative">
      <value>HrInit</value>
    </field>
    <field name="InScope_1">
      <value>InScope 1</value>
    </field>
    <field name="InScope_2">
      <value>InScope 2</value>
    </field>
    <field name="InScope_3">
      <value>InScope 3</value>
    </field>
    <field name="InScope_4">
      <value>InScope 4</value>
    </field>
    <field name="InScope_5">
      <value>InScope 5</value>
    </field>
    <field name="In_HR_Area_1">
      <value>University Relations</value>
    </field>
    <field name="In_HR_Area_2">
      <value>Talent Development</value>
    </field>
    <field name="In_HR_Area_3">
      <value>Change Management</value>
    </field>
    <field name="In_HR_Area_4">
      <value>Global Compensation </value>
    </field>
    <field name="In_HR_Area_5">
      <value>Career Framework</value>
    </field>
    <field name="In_HR_NumEmp_1">
      <value>1</value>
    </field>
    <field name="In_HR_NumEmp_2">
      <value>42345</value>
    </field>
    <field name="In_HR_NumEmp_3">
      <value>12</value>
    </field>
    <field name="In_HR_NumEmp_4">
      <value>12</value>
    </field>
    <field name="In_HR_NumEmp_5">
      <value>11</value>
    </field>
    <field name="In_HR_NumMonth_1">
      <value>1</value>
    </field>
    <field name="In_HR_NumMonth_2">
      <value>3</value>
    </field>
    <field name="In_HR_NumMonth_3">
      <value>4</value>
    </field>
    <field name="In_HR_NumMonth_4">
      <value>5</value>
    </field>
    <field name="In_HR_NumMonth_5">
      <value>1</value>
    </field>
    <field name="In_HR_PerTime_1">
      <value>2</value>
    </field>
    <field name="In_HR_PerTime_2">
      <value>5</value>
    </field>
    <field name="In_HR_PerTime_3">
      <value>2</value>
    </field>
    <field name="In_HR_PerTime_4">
      <value>11</value>
    </field>
    <field name="In_HR_PerTime_5">
      <value>03</value>
    </field>
    <field name="Opex_LifeAmount">
      <value>23424</value>
    </field>
    <field name="Opex_Yr">
      <value>12</value>
    </field>
    <field name="Opex_YrAmount">
      <value>123123</value>
    </field>
    <field name="Opex_YrAmountP">
      <value>345235</value>
    </field>
    <field name="OutScope_1">
      <value>outScope 1</value>
    </field>
    <field name="OutScope_2">
      <value>outScope 2</value>
    </field>
    <field name="OutScope_3">
      <value>outScope 3</value>
    </field>
    <field name="OutScope_4">
      <value>outScope 4</value>
    </field>
    <field name="OutScope_5">
      <value>outScope 5</value>
    </field>
    <field name="Out_HR_Area_1">
      <value>sdfsg</value>
    </field>
    <field name="Out_HR_Area_2">
      <value>dffnjgmkgm,k</value>
    </field>
    <field name="Out_HR_Area_3">
      <value>3453wygh</value>
    </field>
    <field name="Out_HR_Area_4">
      <value>235hc bn</value>
    </field>
    <field name="Out_HR_Area_5">
      <value>234dfg435</value>
    </field>
    <field name="Out_HR_NumEmp_1">
      <value>1</value>
    </field>
    <field name="Out_HR_NumEmp_2">
      <value>6</value>
    </field>
    <field name="Out_HR_NumEmp_3">
      <value>34</value>
    </field>
    <field name="Out_HR_NumEmp_4">
      <value>2</value>
    </field>
    <field name="Out_HR_NumEmp_5">
      <value>78</value>
    </field>
    <field name="Out_HR_NumMonth_1">
      <value>23</value>
    </field>
    <field name="Out_HR_NumMonth_2">
      <value>23</value>
    </field>
    <field name="Out_HR_NumMonth_3">
      <value>1</value>
    </field>
    <field name="Out_HR_NumMonth_4">
      <value>235</value>
    </field>
    <field name="Out_HR_NumMonth_5">
      <value>52</value>
    </field>
    <field name="Out_HR_PerTime_1">
      <value>3</value>
    </field>
    <field name="Out_HR_PerTime_2">
      <value>1</value>
    </field>
    <field name="Out_HR_PerTime_3">
      <value>11</value>
    </field>
    <field name="Out_HR_PerTime_4">
      <value>1</value>
    </field>
    <field name="Out_HR_PerTime_5">
      <value>1</value>
    </field>
    <field name="Problem Statement">
      <value>Prob Statemewnt dghdheh</value>
    </field>
    <field name="ProjSponName_First">
      <value>ProjSponFirst</value>
    </field>
    <field name="ProjSponName_Last">
      <value>ProjSponLast</value>
    </field>
    <field name="Project Description">
      <value>Project Desc</value>
    </field>
    <field name="Project Name">
      <value>ProjName</value>
    </field>
    <field name="ProposedMeasure_1">
      <value>Prop meas 1 </value>
    </field>
    <field name="ProposedMeasure_2">
      <value>prop meas 2</value>
    </field>
    <field name="ProposedMeasure_3">
      <value>prop meas 3 </value>
    </field>
    <field name="ProposedMeasure_4">
      <value>prop meas 4</value>
    </field>
    <field name="ProposedMeasure_5">
      <value>prop meas 5</value>
    </field>
    <field name="Savings_Yr">
      <value>2</value>
    </field>
    <field name="Savings_YrAmount">
      <value>12313</value>
    </field>
    <field name="Savings_YrAmountP">
      <value>234254</value>
    </field>
    <field name="Submit" />
    <field name="SubmitByName_First">
      <value>SubNameFirst</value>
    </field>
    <field name="SubmitByName_Last">
      <value>SubNameLast</value>
    </field>
    <field name="TempHeadCount">
      <value>2</value>
    </field>
    <field name="TempMonths">
      <value>4</value>
    </field>
  </fields>
  <ids original="FA098893AF1E1140A46335ED9EEEA5CE" modified="A3D8A6418EFE9E439F3CADCF3350958D" />
</xfdf>

EXCEL VBA RAW:

Sub AddFormData()    

    Dim FName As String, FD As FileDialog
    Dim WApp As Object, WDoc As Object, WDR As Object
    Dim ExR As Range    

    Set FD = Application.FileDialog(msoFileDialogOpen)
    FD.Filters.Add "PDF FORM DATA", "*.xfdf", 1
    FD.Show

    If FD.SelectedItems.Count <> 0 Then        
        For yi = 1 To FD.SelectedItems.Count
            FName = FD.SelectedItems(yi)

            '''''''''''''''''''''''''''''''''''''''''
            Set oXMLFile = CreateObject("MSXML2.DOMDocument") '''“Microsoft.XMLDOM”)
            XMLFileName = FName
            oXMLFile.Load (XMLFileName)

            Set List = oXMLFile.SelectNodes("//fields/")    

            ''''''''''''''''''''''''''''''''''''''''

            '''''''''''''''''''''''''''''''''''''
        Next yi    
    End If    
End Sub
Parfait
  • 104,375
  • 17
  • 94
  • 125
PDT352
  • 25
  • 4

1 Answers1

1

Consider XPath, a more expressive parsing tool than walking down .Item, .ChildNodes, .Attributes, and other element collections. Also, you have a default namespace, xmlns="...", in your XML document. To resolve, assign a colon-separated prefix like pdf to default namespace to access all nodes in its scope. Below is a generalized version to walk down nested levels of tree:

Set oXMLFile = CreateObject("MSXML2.DOMDocument") '''“Microsoft.XMLDOM”)
XMLFileName = FName
oXMLFile.Load XMLFileName

' TEMPORARILY DEFINE PREFIX FOR DEFAULT NAMESPACE
oXMLFile.setProperty "SelectionNamespaces", "xmlns:pdf='http://ns.adobe.com/xfdf/'"

' PRINT ALL THIRD NESTED LEVEL NODE VALUES
Set myList = oXMLFile.SelectNodes("/*/*/*")
For Each var In myList
    Debug.Print var.Text
Next var
' 2
' 3
' 1
' D:23440101
' D:13240101
' D:12340101

' PRINT ALL THIRD NESTED LEVEL ATTRIBUTES VALUES
Dim var as Variant
...
Set myList = oXMLFile.SelectNodes("/*/*/*/@*")
For Each var In myList
    Debug.Print var.Text
Next var
' 3 C_AQS_1
' 3 C_AQS_2
' 3 C_AQS_3
' 3 C_AYS_1
' 3 C_AYS_2
' 3 C_AYS_3

Should you need to access indexed values, you can still use XPath for node or attribute value

' PRINT INDEXED NODE AND ATTRIBUTE
Set myList = oXMLFile.SelectNodes("/*/*/*")         ' MOVE TO SECOND NESTED LEVEL
Debug.Print myList(0).SelectSingleNode("*").Text
Debug.Print myList(0).SelectSingleNode("@*").Text
' 2
' 3 C_AQS_1

Debug.Print myList(1).SelectSingleNode("*").Text
Debug.Print myList(1).SelectSingleNode("@*").Text
' 3
' 3 C_AQS_2

Debug.Print myList(2).SelectSingleNode("*").Text
Debug.Print myList(2).SelectSingleNode("@*").Text
' 1
' 3 C_AQS_3

And in loop

Dim var as Variant
...
For Each var In myList
    If Not var.SelectSingleNode("*") Is Nothing Then
        Debug.Print var.SelectSingleNode("*").Text
        Debug.Print var.SelectSingleNode("@*").Text
    End If
Next var
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • This is part of a larger program and due to constraints I cannot call out the field name directly (it changes and is arbitrary). I need to be able to access the field via an index/'walking down' approach. – PDT352 Nov 04 '19 at 22:50
  • Then remove attribute condition and use `//pdf:field` and loop through node list or use index as shown with `myList(0)`, `myList(1), `myList(2)`... – Parfait Nov 04 '19 at 22:59
  • My initial post already does all of this - I need to know how to access the nested attribute to read it's value. I'm just looking for proper syntax for navigating through the nest. Is it possible to access it in a format such as: myList(n).attributes(1).nodeValue The value of n will change, I have a loop that will be looking for the value at this nested location. See highlight https://i.imgur.com/RneA7uu.png – PDT352 Nov 04 '19 at 23:32
  • See updated solution with a generalized version for traversing nested elements by loop or index. You can still use XPath to retrieve node and attribute value. – Parfait Nov 05 '19 at 14:53
  • I understand now, didn't know about .SelectSingleNode() or the @, * XPath operators. Thank you! The method works a charm. – PDT352 Nov 05 '19 at 20:18