I am writing a DataSet into an XML file using:

DataSet.WriteXml(XMLWriter, XmlWriteMode.WriteSchema);

XMLWriter encoding is set with Encoding.UTF8.

Everything works fine except for the dot ('.') character. WriteXml converts special characters into their Unicode hex codes: for example, a space (' ') is written as x0020 and an underscore ('_') as x005F. However, the dot ('.') is stored as-is.

How can I make sure that the dot ('.') is stored as x2024 in the XML file generated by DataSet.WriteXml?

Remy Lebeau
Syed
  • Presumably you're talking about encoding of characters in *element names*. The `.` character is included as-is in the name because it's a valid character to include in an element name. From the [standard](https://www.w3.org/TR/2006/REC-xml-20060816/#NT-NameChar): `NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender`. See the [docs](https://msdn.microsoft.com/en-us/library/system.xml.xmlconvert.encodename(v=vs.110).aspx) for `XmlConvert.EncodeName()` for more about how arbitrary strings get encoded as element names. – dbc Jun 15 '17 at 05:51
  • Yes, you got it right. I am asking specifically about the DataSet.WriteXml functionality, because I am not creating the XML file manually. Rather, I create a DataSet and call its WriteXml method to generate the XML file. If someone has come across a solution for this scenario, it would be helpful for me – Syed Jun 15 '17 at 06:05
  • "*is stored as x2024*" - x2024 is the hex encoding of Unicode codepoint [U+2024 ONE DOT LEADER](http://www.fileformat.info/info/unicode/char/2024/index.htm), which is a very different character from [U+002E FULL STOP](http://www.fileformat.info/info/unicode/char/002e/index.htm) (aka the ASCII "period"). U+002E does not need to be hex encoded. – Remy Lebeau Jun 16 '17 at 04:33

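The behaviour described in the comments can be checked directly with `XmlConvert`, without going through a DataSet. This is a minimal sketch, not taken from the original question: the names "Order Date" and "Order.Date" are made-up examples, and it only illustrates the name-encoding rules, not a way to change what `DataSet.WriteXml` emits.

```csharp
using System;
using System.Xml;

class EncodeNameDemo
{
    static void Main()
    {
        // A space is not a valid XML name character, so it is escaped.
        Console.WriteLine(XmlConvert.EncodeName("Order Date"));        // Order_x0020_Date

        // '.' (U+002E) is a valid NameChar, so it passes through unchanged.
        Console.WriteLine(XmlConvert.EncodeName("Order.Date"));        // Order.Date

        // _x2024_ decodes to U+2024 ONE DOT LEADER, not to the ASCII period.
        string leader = XmlConvert.DecodeName("_x2024_");
        Console.WriteLine(((int)leader[0]).ToString("X4"));            // 2024

        // If an escaped period were ever needed, its code would be 002E.
        Console.WriteLine(XmlConvert.DecodeName("Order_x002E_Date"));  // Order.Date
    }
}
```

The last call suggests that a hand-escaped `_x002E_` would decode back to a plain period, so the escaped and unescaped forms name the same thing once decoded.
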
0 Answers