I have this XML input:

<?xml version="1.0" encoding="utf-8"?>
<string>
&lt;N/A&gt;
</string>

Here is a short code sample to illustrate the problem:

uses
  xmldom, oxmldom, XMLDoc, XMLIntf;

procedure TForm1.Test;
var
  Document     : IXMLDocument;
  StringNode   : IXMLNode;
  LessThanNode : IXMLNode;
begin
  DefaultDOMVendor := 'Open XML';
  Document         := LoadXMLData(Memo1.Lines.Text);
  StringNode       := Document.DocumentElement;
  LessThanNode     := StringNode.ChildNodes.First;
  ShowMessage(LessThanNode.Text); // Displays '' (an empty string)
  ShowMessage(LessThanNode.XML);  // Displays '&lt;'
  ShowMessage(StringNode.Text);   // Causes an EXMLDocError, because the string node contains more than just a single node with NodeType = ntText
end;

How can I get the Open XML parser to convert &lt;, &gt; and similar XML entities into the characters they represent (such as < and >)?

I could write a workaround for the predefined entities in the XML specification: http://www.w3.org/TR/2008/REC-xml-20081126/#sec-predefined-ent

That won't help with additional entity nodes though ...
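For illustration, the workaround for the predefined entities could look roughly like the sketch below. `DecodePredefinedEntities` is a hypothetical helper name, not part of XMLDoc or the Open XML DOM; it only assumes `StringReplace` from `SysUtils` and handles nothing beyond the five predefined entities.

```pascal
uses
  SysUtils;

// Hypothetical helper (not part of XMLDoc or ADOM): replaces the five
// predefined XML entities with the literal characters they stand for.
function DecodePredefinedEntities(const S: string): string;
begin
  Result := StringReplace(S, '&lt;', '<', [rfReplaceAll]);
  Result := StringReplace(Result, '&gt;', '>', [rfReplaceAll]);
  Result := StringReplace(Result, '&quot;', '"', [rfReplaceAll]);
  Result := StringReplace(Result, '&apos;', '''', [rfReplaceAll]);
  // '&amp;' is handled last so that '&amp;lt;' becomes '&lt;' and not '<'.
  Result := StringReplace(Result, '&amp;', '&', [rfReplaceAll]);
end;
```

Called as `DecodePredefinedEntities('&lt;N/A&gt;')`, this returns `<N/A>`. It works on raw markup only, so custom entities declared in a DTD would pass through unchanged.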

Related: Why doesn't IXMLNode.IsTextElement return True for CDATA elements?

Jens Mühlenhoff

2 Answers


In your case I think the InnerText property should work.

ShowMessage(Document.DocumentElement.InnerText);

Edit: The InnerText property is not part of the IXMLNode interface (I think MSXML has it). The OpenXML implementation (ADOM) has a GetTextContent method that probably does the same thing, so you may want to look into it.

Leonardo Herrera

Newer versions of Delphi no longer ship the oxmldom unit, and newer versions of the so-called ADOM are available at:

http://www.philo.de/xml/downloads.shtml

So either using a different parser or upgrading OpenXML solves the problem.

Jens Mühlenhoff