I use the XmlDocument class to load XML and the XmlWriter class to generate the file. I want to preserve the decimal character references (&#10;) that are present in the XML, as in the line below:

<car id="wait for the signal &#10;&#10; then proceed">

I tried options such as XmlTextReader, but had no luck. After processing, the line above looks like one of the following in the output file:

    <car id="wait for the signal

then proceed">

or

<car id="wait for the signal &#xA;&#xA; then proceed">

The XmlWriter code I used:

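// Requires: using System.Xml;
// encoding, xmlString and filepath are defined elsewhere.
XmlWriterSettings xmlWriterSettings = new XmlWriterSettings
{
    Indent = true,
    Encoding = encoding,
    // Leave line breaks in the output as they appear in the document.
    NewLineHandling = NewLineHandling.None
};
// XmlDataDocument derives from XmlDocument.
XmlDataDocument xmlDataDocument = new XmlDataDocument
{
    PreserveWhitespace = true
};
// The parser expands &#10; to a line feed character at this point.
xmlDataDocument.LoadXml(xmlString);
using (XmlWriter writer = XmlWriter.Create(filepath, xmlWriterSettings))
{
    if (writer != null)
    {
        xmlDataDocument.Save(writer);
    }
}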

Any help on this is much appreciated.

  • What do you want to achieve? What do you mean by "preserve"? Either it's converted to a linefeed, or it remains a character reference. You have shown both options; does neither satisfy you? The topic here is XML encoding. – Holger Oct 25 '19 at 06:57

2 Answers

I am unsure what you are trying to achieve here.

Both &#10; and &#xA; are equivalent. They both reference the line feed character (U+000A). One is in decimal form, the other is in hexadecimal form. Either will work with an XML parser.

The official XML specification states that either form of character reference is allowed.
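
To illustrate the equivalence, here is a minimal sketch (the class name CharRefDemo and the helper ReadId are made up for this example, and the attribute text is shortened). It parses the same attribute once with decimal and once with hexadecimal references and compares the values the parser hands back:

```csharp
using System;
using System.Xml;

public static class CharRefDemo
{
    // Hypothetical helper: parses a single element and returns its id attribute.
    public static string ReadId(string xml)
    {
        var doc = new XmlDocument { PreserveWhitespace = true };
        doc.LoadXml(xml);
        return doc.DocumentElement.GetAttribute("id");
    }

    public static void Main()
    {
        string dec = ReadId("<car id=\"wait &#10;&#10; then proceed\" />");
        string hex = ReadId("<car id=\"wait &#xA;&#xA; then proceed\" />");

        // Both references have been expanded to the same character,
        // so the parsed values are identical.
        Console.WriteLine(dec == hex);
    }
}
```

Once the document is loaded, nothing in the parsed value records which form the source file used, which is why the writer is free to pick either one on output.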

Why do you need to preserve the decimal form?

Espen
  • My requirement is that I have &#10; in the file before processing, and it gets replaced with &#xA; after the file is processed. Is there any option that lets me retain the decimal reference in the file? – Sarath Kumar Harikrishnan Oct 25 '19 at 17:13
  • My point is that &#10; and &#xA; are the same: they both mean "10", one in decimal and one in hexadecimal. The parser doesn't care which one you use, and parsers and encoders will interchange them. Why would you need to preserve this form? – Espen Oct 27 '19 at 09:01
  • Thanks for clarifying this. I was thinking the same. I have an external team that had trouble parsing those character references in the XML. – Sarath Kumar Harikrishnan Oct 28 '19 at 06:08

As far as an XML parser is concerned, it will always treat a numeric character reference in exactly the same way as it treats the character itself.

Your only possible way forward is to preprocess the file before the XML parser gets to see it, replacing the & with some other character such as §. And then of course, reverse the process afterwards.
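
A rough sketch of that preprocessing idea follows. It is one possible way to do it, not a definitive implementation: the class CharRefPreserver and its methods are hypothetical names, and it masks only numeric character references rather than every &. It assumes that § never appears directly before a "#...;" sequence in the real data, and that the output encoding can represent § literally (otherwise the writer would escape it and the unmasking step would miss it). For simplicity it writes to a string and omits the XML declaration.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;
using System.Xml;

public static class CharRefPreserver
{
    // Hypothetical placeholder; assumed not to occur before "#...;" in the input.
    private const string Placeholder = "§";

    // Matches numeric character references such as &#10; or &#xA;.
    private static readonly Regex NumericRef =
        new Regex(@"&(#(?:[0-9]+|x[0-9A-Fa-f]+);)");

    // Matches the masked form produced by Mask.
    private static readonly Regex MaskedRef =
        new Regex(Placeholder + @"(#(?:[0-9]+|x[0-9A-Fa-f]+);)");

    // Hides numeric references from the parser: &#10; becomes §#10;
    public static string Mask(string xml) =>
        NumericRef.Replace(xml, Placeholder + "$1");

    // Restores the original references: §#10; becomes &#10;
    public static string Unmask(string xml) =>
        MaskedRef.Replace(xml, "&$1");

    // Loads the masked XML, saves it through XmlWriter, then unmasks the result.
    public static string RoundTrip(string xml)
    {
        var doc = new XmlDocument { PreserveWhitespace = true };
        doc.LoadXml(Mask(xml));

        var settings = new XmlWriterSettings
        {
            OmitXmlDeclaration = true, // sketch only: avoids the encoding declaration
            NewLineHandling = NewLineHandling.None
        };
        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            doc.Save(writer);
        }
        return Unmask(sb.ToString());
    }

    public static void Main()
    {
        string input = "<car id=\"wait for the signal &#10;&#10; then proceed\" />";
        Console.WriteLine(RoundTrip(input));
    }
}
```

Any code that inspects the document between Mask and Unmask sees the placeholder text rather than a line feed, so this only suits cases where the masked values are passed through untouched.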

Michael Kay