13

I have the following code, which I want to output XML data using the UTF-8 encoding, but it always outputs the data as UTF-16:

    XslCompiledTransform xslt = new XslCompiledTransform();

    xslt.Load(XmlReader.Create(new StringReader(xsltString), new XmlReaderSettings()));

    StringBuilder sb = new StringBuilder();

    XmlWriterSettings writerSettings = new XmlWriterSettings();
    writerSettings.Encoding = Encoding.UTF8;
    writerSettings.Indent = true;

    xslt.Transform(XmlReader.Create(new StringReader(inputXMLToTransform)), XmlWriter.Create(sb, writerSettings));
Attilah

2 Answers

15

The XML output will contain a header (the XML declaration) that is based on the encoding of the output target, not the encoding specified in the settings. A StringBuilder holds .NET string data, which is 16-bit Unicode, so the header will say UTF-16 regardless of writerSettings.Encoding. The workaround is to suppress the header and add it yourself instead:

writerSettings.OmitXmlDeclaration = true;

Then when you get the result from the StringBuilder:

string xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n" + sb.ToString();
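Putting the two steps together, a minimal sketch might look like the following. The class and method names (`DeclarationWorkaroundSketch`, `TransformToString`) are hypothetical and only for illustration; the rest follows the question's code. Note that the returned string is still UTF-16 in memory, so it only becomes UTF-8 when it is encoded to bytes on the way out.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper, not part of the original answer.
static class DeclarationWorkaroundSketch
{
    public static string TransformToString(string xsltString, string inputXml)
    {
        var xslt = new XslCompiledTransform();
        using (var stylesheetReader = XmlReader.Create(new StringReader(xsltString)))
        {
            xslt.Load(stylesheetReader);
        }

        var writerSettings = new XmlWriterSettings
        {
            OmitXmlDeclaration = true, // suppress the declaration that would say utf-16
            Indent = true
        };

        var sb = new StringBuilder();
        using (var inputReader = XmlReader.Create(new StringReader(inputXml)))
        using (var writer = XmlWriter.Create(sb, writerSettings))
        {
            xslt.Transform(inputReader, writer);
        } // disposing the writer flushes any pending output into sb

        // Prepend the declaration by hand, as described above.
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n" + sb.ToString();
    }
}
```

When the result is written to a file or sent over the network, the bytes need to be produced with a UTF-8 encoder so that they match the declaration, for example `File.WriteAllText(path, xml, new UTF8Encoding(false))`, where `path` is a placeholder.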
Guffa
  • Old post but I have a question. I tried this and it worked for my situation but I'm wondering if this work-around actually changes the encoding? My usage is sending some xml data to another system. The system accepts the xml if the declaration says utf-8 but not utf-16. I feel that the actual encoding of the xml string has not changed and I'm just getting lucky that it works. – James Gardner Dec 17 '20 at 19:32
  • @JamesGardner: It doesn't change the encoding, it makes the header match the actual encoding that is used to create the XML. – Guffa Jan 02 '21 at 15:29
7

If you use a MemoryStream in place of the StringBuilder, the XmlWriter will respect the encoding you specify in the XmlWriterSettings, since the MemoryStream doesn't have an inherent encoding like the StringBuilder does.
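A rough sketch of that approach, assuming the same `xsltString` and input XML as in the question. The class and method names (`MemoryStreamSketch`, `TransformToUtf8Bytes`) are hypothetical, and using `new UTF8Encoding(false)` instead of `Encoding.UTF8` is an added choice here to avoid a byte order mark at the start of the output.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper, not part of the original answer.
static class MemoryStreamSketch
{
    public static byte[] TransformToUtf8Bytes(string xsltString, string inputXml)
    {
        var xslt = new XslCompiledTransform();
        using (var stylesheetReader = XmlReader.Create(new StringReader(xsltString)))
        {
            xslt.Load(stylesheetReader);
        }

        var writerSettings = new XmlWriterSettings
        {
            // Assumption: no BOM wanted; Encoding.UTF8 would emit one.
            Encoding = new UTF8Encoding(false),
            Indent = true
        };

        using (var output = new MemoryStream())
        {
            using (var inputReader = XmlReader.Create(new StringReader(inputXml)))
            using (var writer = XmlWriter.Create(output, writerSettings))
            {
                xslt.Transform(inputReader, writer);
            } // disposing the writer flushes the encoded bytes into the stream

            return output.ToArray();
        }
    }
}
```

The returned bytes are already UTF-8 and can be written to a file or a network stream as they are. If a string is needed, `Encoding.UTF8.GetString(bytes)` decodes them, though the declaration inside that string will still say UTF-8 while the string itself is UTF-16 in memory.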

Dave Andersen