I would like to write Latin accented characters in an XML text file using the corresponding ISO code. For instance, é should be written as `&#233;` in the file. My issue is that the leading & is escaped by Xml.Linq, so the result is `&amp;#233;`.

internal void SaveCurrentFile(XElement root)
{
    var encoding = Encoding.GetEncoding("ISO-8859-1");
    XmlWriterSettings xmlWriterSettings = new XmlWriterSettings
    {
        Indent = true,
        OmitXmlDeclaration = false,
        Encoding = encoding
    };
    using (var writer = XmlWriter.Create(GetFolderPath() + "test.xml", xmlWriterSettings))
    {
        root.Save(writer);
    }
}

I did not see any options in XmlWriterSettings to help me.

Thank you for your help.

abatishchev
M07
  • You don't actually need to do anything. Just write the character normally and make sure to use UTF8. – SLaks Oct 29 '17 at 19:24
  • You could write with `Encoding.ASCII`. If you do, all non-ASCII Unicode characters will get escaped as shown in [Escaping unicode string in XmlElement despite writing XML in UTF-8](https://stackoverflow.com/q/18006146/3744182). – dbc Oct 29 '17 at 19:32
  • I could not use UTF8 (due to personal restrictions) but with the ASCII encoding it looks good. Thank you! – M07 Oct 29 '17 at 19:45
  • How does this restriction arise? XML text is Unicode characters, regardless of the document encoding. – Tom Blodget Oct 30 '17 at 01:55
  • The XML file is consumed by a game. I cannot control how it reads the file, and I already tried giving it UTF-8 files, which did not work. – M07 Oct 30 '17 at 15:02

1 Answer

Numeric character entity references are Unicode codepoints. For example, `&#x1F6B2;` for 🚲 (U+1F6B2). é would be `&#xE9;` (U+00E9). Numeric character entity references are typically used only when the document encoding doesn't support a particular Unicode codepoint. An XML writer takes care of this for you.

So if you want é to be emitted as a numeric character entity, you just have to use a document encoding that doesn't support it. ASCII is one. However, it will probably be in hexadecimal (`&#xE9;`) rather than decimal (`&#233;`).
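
As a minimal sketch of that approach, the snippet below serializes an `XElement` with `Encoding.ASCII` instead of ISO-8859-1 and returns the text that was written. The class name `EntityDemo` and the method `ToAsciiXml` are hypothetical names chosen for illustration and are not part of the question's code. It writes to a `MemoryStream` rather than a file so the result can be inspected; the encoding only takes effect when the writer targets a stream or file, not a `StringWriter`.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class EntityDemo
{
    // Hypothetical helper (not from the question): serializes the element as
    // ASCII-encoded XML and returns the resulting text.
    public static string ToAsciiXml(XElement root)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            OmitXmlDeclaration = false,
            // ASCII cannot represent é, so the writer should fall back to a
            // numeric character reference for it.
            Encoding = Encoding.ASCII
        };

        using (var stream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(stream, settings))
            {
                root.Save(writer);
            }
            // The writer is flushed when disposed; decode the bytes it wrote.
            return Encoding.ASCII.GetString(stream.ToArray());
        }
    }
}
```

Calling `EntityDemo.ToAsciiXml(new XElement("name", "café"))` should yield a document in which é appears as a character reference rather than as a raw character. In the question's `SaveCurrentFile`, the equivalent change would be to set `Encoding = Encoding.ASCII` in the `XmlWriterSettings`. Note that the XML declaration will then name the ASCII encoding rather than ISO-8859-1.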

Tom Blodget
  • That is correct. I have seen that the input is (`&#xE9;`) but it is fine for me in hexadecimal too. – M07 Oct 30 '17 at 15:04