
I am trying to extract the XML information from an XFA form using VBA.

The code below extracts the XML data to a separate file, but it requires user interaction (the user is prompted to name the XML file). I have given up trying to automate this without user interaction because of Adobe's "safe path" requirement, which seems impossible to bypass from a VBA automation.

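' Early binding: requires a reference to the Acrobat type library
Dim objPDDoc As New AcroPDDoc
Dim objJSO As Object
Dim strSafePath As String

' With an empty path, the user is prompted for a file name
strSafePath = ""

objPDDoc.Open (FileName)
Set objJSO = objPDDoc.GetJSObject
objJSO.xfa.host.exportdata strSafePath, 0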

What I would rather do is to parse the XML information directly using MSXML2.DOMDocument60. I was hoping to be able to do something like this:

Dim XMLDoc As New MSXML2.DOMDocument60

If XMLDoc.Load(objJSO.xfa.host.exportdata) = True Then
    Call funcParse(XMLDoc)
End if

However, loading XMLDoc from objJSO.xfa.host.exportdata doesn't work, and I cannot figure out which xfa.host methods or properties, if any, can pass the XML information directly.

Any help is welcome, including being told that this is not possible in VBA.

CM2020

2 Answers


Try something like this:

Dim myXMLstring As String
myXMLstring = "<XML>BLA</XML>"

' LoadXML parses a string; Load reads from a file path or URL
Dim xmlDoc As MSXML2.DOMDocument60
Set xmlDoc = New MSXML2.DOMDocument60
xmlDoc.LoadXML myXMLstring

For a fuller example, see this post: https://desmondoshiwambo.wordpress.com/2012/07/03/how-to-load-xml-from-a-local-file-with-msxml2-domdocument-6-0-and-loadxml-using-vba/
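
When the string comes from a form rather than a literal, it may help to check the parser's verdict before using the document. A minimal sketch, assuming a reference to Microsoft XML, v6.0 is set; TryLoadXml is a hypothetical helper name, not part of MSXML:

```vba
' Hypothetical helper: parses an XML string and reports why parsing failed.
Function TryLoadXml(ByVal strXml As String, _
                    ByRef xmlDoc As MSXML2.DOMDocument60) As Boolean
    Set xmlDoc = New MSXML2.DOMDocument60

    If xmlDoc.LoadXML(strXml) Then
        TryLoadXml = True
    Else
        ' parseError describes the first problem the parser hit
        With xmlDoc.parseError
            Debug.Print "XML parse error " & .errorCode & _
                        " at line " & .Line & ": " & .reason
        End With
        TryLoadXml = False
    End If
End Function
```

It could be called as `If TryLoadXml(myXMLstring, xmlDoc) Then ...`, so that a malformed string shows up in the Immediate window instead of failing silently.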

Koen Rijnsent
  • Thanks for responding Koen. But my challenge is extracting the XML nested in the XFA, so that I can pass it to XMLDoc (through Load or LoadXML). So using your answer as an example; how do I generate the myXMLstring from a XML nested in an XFA form? – CM2020 Feb 26 '20 at 17:35
  • Mmm, I don't have full Acrobat on my system, so I can't test it. The documentation gives this clue: "A path cannot point to a system critical folder, for example a root, windows or system directory. A path is also subject to other unspecified tests. For many methods, the file name must have an extension appropriate to the type of data that is to be saved. Some methods may have a no-overwrite restriction. These additional restrictions are noted in the documentation." So you could try some locations and file names to see what works, and after that read/load that file (and remove it)? – Koen Rijnsent Feb 27 '20 at 09:43
  • Thank you for your thoughts on this Koen. I have yet to succeed in finding a path that will allow me to automatically save the XML file. Therefore, I was hoping that passing the XML data to a parser without having to save it at all would be a simpler solution. However, your proposal may be the best chance of success. – CM2020 Mar 01 '20 at 21:03
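
The "save, load, then remove" idea from the comments could look roughly like the sketch below. It assumes the export to strPath has already succeeded, which is the part that remained unsolved in this thread. LoadAndDeleteXml is a hypothetical helper name, and a reference to Microsoft XML, v6.0 is assumed:

```vba
' Hypothetical helper: loads an exported XML file, then deletes the file.
' Returns Nothing if the file is missing or cannot be parsed.
Function LoadAndDeleteXml(ByVal strPath As String) As MSXML2.DOMDocument60
    Dim xmlDoc As New MSXML2.DOMDocument60

    ' Guard against an empty path or a missing file
    If Len(strPath) = 0 Then Exit Function
    If Len(Dir$(strPath)) = 0 Then Exit Function

    ' Load synchronously so the file is fully read before it is deleted
    xmlDoc.async = False
    If xmlDoc.Load(strPath) Then
        Set LoadAndDeleteXml = xmlDoc
    End If

    ' Remove the temporary file whether or not parsing succeeded
    Kill strPath
End Function
```

The caller would test the result with `Is Nothing` before passing it on to a parser.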

Original poster here. After about a year of looking into this on-and-off, I found the solution.

After accessing the JavaScript object through AcroPDDoc.GetJSObject, I can extract the nested XML as a string by using objJSO.xfa.this.saveXML.

This way, I don't have to save the nested XML to a file first (which would require user interaction). Instead, I can immediately extract the nested XML and pass it to the parser.

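Dim objPDDoc As New AcroPDDoc
Dim objJSO As Object
Dim XMLDoc As New MSXML2.DOMDocument60

objPDDoc.Open (Filename)
Set objJSO = objPDDoc.GetJSObject

' saveXML returns the nested XML as a string, so no file is written
If XMLDoc.LoadXML(objJSO.xfa.this.saveXML) = True Then
    ' Use Call (or drop the parentheses) so the object itself is passed
    Call ParseXML(XMLDoc)
End If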
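
ParseXML is not shown in the answer. As an illustration only, a stand-in could walk the loaded document and print the element names. This is a sketch, not the original poster's routine; WalkNode is a hypothetical helper, and the element names that actually appear depend on the form:

```vba
' Hypothetical stand-in for ParseXML: prints the element tree, indented by depth.
Sub ParseXML(ByVal xmlDoc As MSXML2.DOMDocument60)
    If xmlDoc.documentElement Is Nothing Then Exit Sub
    WalkNode xmlDoc.documentElement, 0
End Sub

' Recursively visits a node and all of its children.
Private Sub WalkNode(ByVal node As MSXML2.IXMLDOMNode, ByVal depth As Long)
    Dim child As MSXML2.IXMLDOMNode

    ' Print element nodes only; text and comment nodes are skipped
    If node.nodeType = NODE_ELEMENT Then
        Debug.Print Space$(depth * 2) & node.nodeName
    End If

    For Each child In node.childNodes
        WalkNode child, depth + 1
    Next child
End Sub
```

Walking the tree avoids XPath queries, which would need the selection namespaces configured if the extracted XML uses namespace prefixes.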
CM2020