Special characters get transformed in the result of my XSLT transformation.
Has anyone experienced this before?

The source document contains an & character, which is presented as &amp; in the result. I need the original & character in the result as well.

XmlDataDocument dd = new XmlDataDocument(ds);

XsltSettings settings = new XsltSettings();
settings.EnableDocumentFunction = true;
settings.EnableScript = true;

XslCompiledTransform transform = new XslCompiledTransform();

transform.Load(XmlReader.Create(new StringReader(transformSource.Transform)), settings, new XmlUrlResolver());

XsltArgumentList a = new XsltArgumentList();

a.AddExtensionObject("http://www.4plusmedia.tv", new TransformationHelper());

using (XmlTextWriter writer = new XmlTextWriter(path, System.Text.Encoding.UTF8))
{
    writer.Formatting = Formatting.Indented;
    transform.Transform(dd, a, writer);
}
GSerg
Florjon
  • Please be more specific. "I get transformed special characters" doesn't really explain what's happening. Please show input and output, as well as what you *wanted* to happen. – Jon Skeet Nov 18 '11 at 14:40
  • the problem is that in the input exists a [&] character, and in the output this is transformed as [& a m p ;] – Florjon Nov 18 '11 at 14:47
  • @Florjon: That is not a problem but proper XML encoding. – H H Nov 18 '11 at 14:57
  • Florjon, what kind of output are you looking for with your XSLT stylesheet, plain text or XML or HTML? Within XML and HTML the ampersand is a special character used to start entity or character references and it needs to be escaped as `&amp;` for other uses so the XmlTextWriter is simply doing its job, namely helping to enforce XML syntax rules. If you don't want that then maybe you shouldn't use an XmlTextWriter to transform to. – Martin Honnen Nov 18 '11 at 14:59
  • The `&` character should always be encoded as `&amp;` in an XML document; not doing so would result in invalid XML. – Thomas Levesque Nov 18 '11 at 14:59
  • Problem or XML logic: I get a dataset with a column of business names, which commonly contain & (e.g. H&M). These names have to be processed through the XML/XSLT logic. In the end I don't want to get H&amp;M. – Florjon Nov 18 '11 at 15:06
  • @Florjon You seem to be confusing actual data with its encoding format. It should not concern you that the single `&` character is represented as multiple characters in XML. `H&amp;M` is exactly what you want. If you load that data into an application and display it on the screen, you will see `H&M`. – GSerg Nov 18 '11 at 15:14
  • @GSerg, my result is .txt file. and in this case it is not displayed as you are saying. – Florjon Nov 18 '11 at 15:31
  • Well, assuming you want plain text output, you should use `<xsl:output method="text"/>` in your XSLT stylesheet, and you should of course not transform to an XmlTextWriter but simply to a FileStream http://msdn.microsoft.com/en-us/library/ms163434.aspx. – Martin Honnen Nov 18 '11 at 15:35
  • Are you saying that your original document is invalid XML and uses a single & rather than escaping it properly? This could happen if the provider of the document rolled their own XML writer without paying attention to the details of the specification. In that case, the best I can recommend is doing a search and replace to restore the invalid XML formatting, though it would be much better if you could fix whatever incorrect code is using the invalid format. – Dan Bryant Nov 18 '11 at 15:36
  • @Martin, thanks for your solution. I had already set [<xsl:output method="text"/>] in the XSLT. The problem was using XmlTextWriter instead of FileStream. You can post this as an answer so I can mark it as accepted. – Florjon Nov 18 '11 at 15:51

3 Answers

Your output part is `using (XmlTextWriter writer = ...) { ... }`

This indicates that your output is XML. You could use XSLT to produce plain text; that would be different.

For XML and HTML output the &amp; encoding is necessary and essential.

At some stage the Value of your XML elements will be used, and that is where (and when) &amp; becomes a & again.
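
As a minimal sketch of that last point (the element name `name`, the class `AmpersandDemo` and the method `ReadValue` are made up for illustration, and LINQ to XML is used only because it keeps the example short), reading the value of an element decodes the entity again:

```csharp
using System;
using System.Xml.Linq;

public static class AmpersandDemo
{
    // Parses the XML text and returns the decoded text content of the root element.
    public static string ReadValue(string xml)
    {
        XElement root = XElement.Parse(xml);
        return root.Value;
    }

    public static void Main()
    {
        // Serialized form: the ampersand must be written as &amp; to be well-formed.
        string xml = "<name>H&amp;M</name>";

        Console.WriteLine(xml);            // <name>H&amp;M</name>
        Console.WriteLine(ReadValue(xml)); // H&M
    }
}
```

The escaping only exists in the serialized file; any XML-aware consumer that reads the value gets the single & character back.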

H H

In the source document there's a character & which in the result is presented as &amp;.

Don't panic, as there is no problem: &amp; is exactly the same character, &, written the way it must be presented in any well-formed XML document.

To see that this is exactly the same character, get the string value of the node and output it. You'll see that just & is output.

Another way to ascertain that this is just the & character is to output it from XSLT with the output method set to "text". Here is a small, complete example:

XML document:

<t>M &amp; M</t>

Transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="/*">
  <xsl:value-of select="."/>
 </xsl:template>
</xsl:stylesheet>

Result:

M & M
Dimitre Novatchev

If you want XslCompiledTransform to output a plain text file as the result of an XSLT transformation, you should not transform to an XmlTextWriter that you create yourself. Instead, transform to a FileStream or a TextWriter, so that the `<xsl:output method="text"/>` setting in the stylesheet is honored.
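
A minimal sketch of that approach follows. The inline stylesheet, the class `TextTransformDemo` and the method `Transform` are assumptions made for illustration, and a StringWriter stands in for the output file so the example is self-contained:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class TextTransformDemo
{
    // Hypothetical stylesheet; method="text" asks for unescaped plain-text output.
    private const string Stylesheet = @"<xsl:stylesheet version=""1.0""
  xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""text""/>
  <xsl:template match=""/*""><xsl:value-of select="".""/></xsl:template>
</xsl:stylesheet>";

    public static string Transform(string inputXml)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader xsl = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(xsl);
        }

        using (XmlReader input = XmlReader.Create(new StringReader(inputXml)))
        using (var output = new StringWriter())
        {
            // The TextWriter overload lets the transform serialize the result itself,
            // so the xsl:output settings apply and & is not escaped.
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Transform("<t>H&amp;M</t>")); // H&M
    }
}
```

To write a file instead, a `StreamWriter` (for example `new StreamWriter(path, false, System.Text.Encoding.UTF8)`) can take the place of the StringWriter; the overloads of `Transform` that accept an `IXPathNavigable` input also accept a TextWriter, so the call in the question can keep its `dd` and `a` arguments.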

Martin Honnen