
While testing code that creates an XML document, I noticed that after a string ending with a line break is written, the end tag produced by the following WriteEndElement is not indented at all. The start tag from the next WriteStartElement is indented correctly.

I tried all three NewLineHandling settings and none of them changed the output. The element cannot contain a CDATA section, because the target system would not know what to do with it.

My code:

using System;
using System.Xml;   
using System.Text;
using System.IO;

public class Program
{
    public static void Main()
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Async = false;
        settings.CloseOutput = false;
        settings.ConformanceLevel = ConformanceLevel.Document;
        settings.Encoding = Encoding.UTF8; // Encoding.GetEncoding(28605);
        settings.Indent = true;
        settings.IndentChars = "  ";
        settings.NamespaceHandling = NamespaceHandling.Default;
        settings.NewLineChars = Environment.NewLine;
        settings.NewLineHandling = NewLineHandling.Entitize;
        settings.NewLineOnAttributes = false;
        settings.OmitXmlDeclaration = false;
        settings.WriteEndDocumentOnClose = false;
        using (MemoryStream ms = new MemoryStream())
        {
            using (XmlWriter xmlWriter = XmlWriter.Create(ms, settings))
            {
                xmlWriter.WriteStartDocument();
                xmlWriter.WriteStartElement("RootNode");
                xmlWriter.WriteStartElement("subnode");
                xmlWriter.WriteString("12\r\n354\r\n");
                xmlWriter.WriteEndElement();
                xmlWriter.WriteStartElement("guid");
                xmlWriter.WriteString(Guid.NewGuid().ToString("D"));
                xmlWriter.WriteEndElement();
                xmlWriter.WriteEndElement();
                xmlWriter.WriteEndDocument();
                xmlWriter.Flush();
            }

            ms.Flush();
            ms.Position = 0;
            using (StreamReader reader = new StreamReader(ms))
            {
                Console.Write(reader.ReadToEnd());
            }
        }       
    }
}

Expected output:

<?xml version="1.0" encoding="utf-8"?>
<RootNode>
  <subnode>12&#xD;
354&#xD;
  </subnode>
  <guid>5c712399-c7b3-45e1-be3d-d5f6718e07b9</guid>
</RootNode>

Output I get:

<?xml version="1.0" encoding="utf-8"?>
<RootNode>
  <subnode>12&#xD;
354&#xD;
</subnode>
  <guid>6ffbd53c-6b9d-482c-b006-ca5f6d40293d</guid>
</RootNode>

As you can see, the </subnode> tag is not aligned with its start tag, and nothing I have tried changes that. To repeat: even with NewLineHandling set to Entitize, the </subnode> tag is not indented, and I don't understand why.

This happens on both .NET Framework 4.5.2 and 4.7.2.
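The statement that NewLineHandling makes no difference can be checked with a small sketch like the one below. It is a reduced variant of the program above, not the original code: the helper name `Render`, the use of a `StringBuilder` target, and `OmitXmlDeclaration = true` are assumptions made only to keep the output short. It writes the same text under each NewLineHandling value and reports whether the end tag starts at the beginning of a line.

```csharp
using System;
using System.Text;
using System.Xml;

public static class NewLineHandlingCheck
{
    // Hypothetical helper: writes the same <subnode> text with the given
    // NewLineHandling value and returns the resulting XML as a string.
    public static string Render(NewLineHandling handling)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            IndentChars = "  ",
            NewLineHandling = handling,
            OmitXmlDeclaration = true // assumption: keeps the output short
        };

        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("RootNode");
            writer.WriteStartElement("subnode");
            writer.WriteString("12\r\n354\r\n");
            writer.WriteEndElement(); // closes <subnode> right after the text
            writer.WriteEndElement(); // closes <RootNode>
        }
        return sb.ToString();
    }

    public static void Main()
    {
        foreach (NewLineHandling handling in Enum.GetValues(typeof(NewLineHandling)))
        {
            string xml = Render(handling);
            // True when </subnode> directly follows a line feed, i.e. no indentation.
            bool unindented = xml.Contains("\n</subnode>");
            Console.WriteLine(handling + ": end tag unindented = " + unindented);
        }
    }
}
```

NewLineHandling only controls how the line breaks inside the text are written (replaced, entitized, or left alone), so it would not be expected to affect where the end tag is placed.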

Steffen Winkler
  • `XmlWriter` is behaving as documented. `<subnode>` consists of text data (or mixed content) and so the writer will not insert any whitespace. From the [docs](https://learn.microsoft.com/en-us/dotnet/api/system.xml.xmlwritersettings.indent?view=netframework-4.8): *The elements are indented as long as the element does not contain mixed content. Once the WriteString or WriteWhitespace method is called to write out a mixed element content, the XmlWriter stops indenting. The indenting resumes once the mixed content element is closed.* – dbc Nov 14 '19 at 19:32
  • See: [C# junk characters break XElement “pretty” representation](https://stackoverflow.com/q/54030664/3744182). – dbc Nov 14 '19 at 19:34
  • @dbc I'm not sure I'm following. Doesn't all alpha-numeric text consist out of character data? Or to be more precise: Shouldn't it also mis-format the guid element? Contrary to the question you linked I also do not have elements and text mixed in a parent element. – Steffen Winkler Nov 15 '19 at 00:54
  • The `<guid>` element has no indentation spacing automatically added - which happens to be what you want. The `<subnode>` element also has no indentation spacing automatically added (remember you're adding the CRLFs yourself) - which is not what you want. But in either case `XmlWriter` is doing what is documented, which is disabling indentation once `WriteString()` is called. – dbc Nov 15 '19 at 01:44
  • But I agree the docs could be clearer. They say indentation is disabled for elements with *mixed content* but they really mean *mixed or text content*. – dbc Nov 15 '19 at 01:45
  • *Doesn't all alpha-numeric text consist out of character data?* - not exactly, XML has a distinction between "significant" and "insignificant" whitespace, see e.g. https://www.tutorialspoint.com/xml/xml_white_spaces.htm and https://www.w3.org/TR/2006/REC-xml-20060816/#sec-white-space. In an element with text or mixed content all whitespace is significant. If the element contains nothing but whitespace (and child elements) the whitespace is generally assumed to be insignificant, unless otherwise specified by the schema. – dbc Nov 15 '19 at 05:53
  • @dbc oh, now I see what you meant. Thank you for the detailed explanation. If you would put that in an answer, I would mark it. Or I could close this since it is essentially a dupe of the question you linked above. – Steffen Winkler Nov 15 '19 at 08:24
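Given the behavior described in the comments (no automatic indentation once `WriteString` has been called on an element), one possible workaround is to write the indentation by hand as part of the text. The following is only a sketch under assumptions: the class name `ManualIndentSketch` and the `Render` helper are hypothetical, and it assumes the target system tolerates two extra trailing spaces in the element's value, because the manually written indentation becomes part of the text content.

```csharp
using System;
using System.Text;
using System.Xml;

public static class ManualIndentSketch
{
    // Hypothetical helper: returns the XML with a hand-written indent
    // in front of </subnode>.
    public static string Render()
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            IndentChars = "  ",
            NewLineHandling = NewLineHandling.Entitize,
            OmitXmlDeclaration = true // assumption: keeps the output short
        };

        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("RootNode");

            writer.WriteStartElement("subnode");
            writer.WriteString("12\r\n354\r\n");
            // The writer will not indent the end tag after WriteString, so one
            // level of indentation is written manually. These spaces are
            // significant whitespace and end up in the element's text value.
            writer.WriteString(settings.IndentChars);
            writer.WriteEndElement();

            // Automatic indentation resumes for the next sibling element.
            writer.WriteStartElement("guid");
            writer.WriteString(Guid.NewGuid().ToString("D"));
            writer.WriteEndElement();

            writer.WriteEndElement();
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Render());
    }
}
```

If the consumer cannot accept the extra whitespace, the alternative is to leave the output as it is: the unindented end tag is the only form that keeps the text value exactly `12\r\n354\r\n`.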
