In Delphi XE2, I'm doing an XSLT transform on a received XML file to remove all namespace information.
Problem: the transform changes the XML declaration from
<?xml version="1.0" encoding="utf-8"?>
into
<?xml version="1.0" encoding="utf-16"?>
This is the XML that I get back from Exchange server:
<?xml version="1.0" encoding="utf-8"?>
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Header>
<h:ServerVersionInfo MajorVersion="14" MinorVersion="0" MajorBuildNumber="722" MinorBuildNumber="0" Version="Exchange2010" xmlns:h="http://schemas.microsoft.com/exchange/services/2006/types" xmlns="http://schemas.microsoft.com/exchange/services/2006/types" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"/>
</s:Header>
<s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<m:ResolveNamesResponse xmlns:m="http://schemas.microsoft.com/exchange/services/2006/messages" xmlns:t="http://schemas.microsoft.com/exchange/services/2006/types">
<m:ResponseMessages>
<m:ResolveNamesResponseMessage ResponseClass="Success">
<m:ResponseCode>NoError</m:ResponseCode>
<m:ResolutionSet TotalItemsInView="1" IncludesLastItemInRange="true">
<t:Resolution>
<t:Mailbox>
<t:Name>developer</t:Name>
<t:EmailAddress>developer@timetellbv.nl</t:EmailAddress>
<t:RoutingType>SMTP</t:RoutingType>
<t:MailboxType>Mailbox</t:MailboxType>
</t:Mailbox>
<t:Contact>
<t:Culture>nl-NL</t:Culture>
<t:DisplayName>developer</t:DisplayName>
<t:GivenName>developer</t:GivenName>
<t:EmailAddresses>
<t:Entry Key="EmailAddress1">SMTP:developer@timetellbv.nl</t:Entry>
</t:EmailAddresses>
<t:ContactSource>ActiveDirectory</t:ContactSource>
</t:Contact>
</t:Resolution>
</m:ResolutionSet>
</m:ResolveNamesResponseMessage>
</m:ResponseMessages>
</m:ResolveNamesResponse>
</s:Body>
</s:Envelope>
This is the function that removes the namespace info:
uses
  MSXML2_TLB; // IXMLDOMdocument

class function TXMLHelper.RemoveNameSpaces(XMLString: String): String;
const
  // An XSLT script for removing the namespaces from any document.
  // From http://wiki.tei-c.org/index.php/Remove-Namespaces.xsl
  cRemoveNSTransform =
    '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">' +
    '<xsl:output method="xml" indent="no"/>' +
    '<xsl:template match="/|comment()|processing-instruction()">' +
    ' <xsl:copy>' +
    ' <xsl:apply-templates/>' +
    ' </xsl:copy>' +
    '</xsl:template>' +
    '<xsl:template match="*">' +
    ' <xsl:element name="{local-name()}">' +
    ' <xsl:apply-templates select="@*|node()"/>' +
    ' </xsl:element>' +
    '</xsl:template>' +
    '<xsl:template match="@*">' +
    ' <xsl:attribute name="{local-name()}">' +
    ' <xsl:value-of select="."/>' +
    ' </xsl:attribute>' +
    '</xsl:template>' +
    '</xsl:stylesheet>';
var
  Doc, XSL: IXMLDOMdocument2;
begin
  Doc := ComsDOMDocument.Create;
  Doc.ASync := false;
  XSL := ComsDOMDocument.Create;
  XSL.ASync := false;
  try
    Doc.loadXML(XMLString);
    XSL.loadXML(cRemoveNSTransform);
    Result := Doc.TransFormNode(XSL);
  except
    on E:Exception do Result := E.Message;
  end;
end; { RemoveNameSpaces }
But after this, it's suddenly a utf-16 document:
<?xml version="1.0" encoding="UTF-16"?>
<Envelope>
[snip]
</Envelope>
After Googling "xsl utf-8 utf-16" I tried several things:
Change the line (see e.g. "Output DataTable XML in UTF8 rather than UTF16")
<xsl:output method="xml" indent="no"/>
into one of:
<xsl:output method="xml" encoding="utf-8" indent="no"/>
<xsl:output method="xml" encoding="utf-8"/>
<xsl:output encoding="utf-8"/>
That did not work.
(It would be the optimal solution, according to http://www.xml.com/pub/a/2002/09/04/xslt.html: "The encoding attribute actually does more than add an encoding declaration to the result document; it tells the XSLT processor to write out the result using that encoding.")
Change the line (see e.g. "XslCompiledTransform uses UTF-16 encoding")
<xsl:output method="xml" indent="no"/>
into
<xsl:output method="xml" omit-xml-declaration="yes" indent="no" />
which leaves out the XML declaration, but if I then just prepend
<?xml version="1.0" encoding="utf-8"?>
I will lose characters, because no actual conversion to UTF-8 is done.
IXMLDOMdocument2 does not have an Encoding property.
Any ideas how to fix this?
Remarks/background:
If all else fails, there may still be the option to convert the UTF-16 XML data to UTF-8 afterwards, but that's an entirely different approach.
I'm trying to do everything in UTF-8 because I'm communicating with Exchange server through EWS, and setting the HTTP request header to utf-16 does not work: Exchange tells me that the content-type 'text/xml; charset = utf-16' is not the expected type 'text/xml; charset = utf-8'. EWS returns utf-8 (see start of post).
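For reference, the fallback of converting the data to UTF-8 afterwards could look roughly like the sketch below. It is only an illustration, not a confirmed fix: it assumes the stylesheet uses omit-xml-declaration="yes", so the string returned by the transform carries no declaration of its own, and XmlToUtf8Bytes is a hypothetical helper name that is not part of the code above. Since a Delphi XE2 string is UTF-16 in memory, the bytes have to be produced explicitly; TEncoding.UTF8.GetBytes does that and does not add a byte order mark.

```delphi
program Utf8Sketch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: prepends a UTF-8 XML declaration to a
// declaration-free XML string and encodes the whole text as UTF-8 bytes.
function XmlToUtf8Bytes(const XmlWithoutDecl: string): TBytes;
const
  cDecl = '<?xml version="1.0" encoding="utf-8"?>';
begin
  // The string is UTF-16 in memory; GetBytes performs the real conversion.
  Result := TEncoding.UTF8.GetBytes(cDecl + XmlWithoutDecl);
end;

var
  Bytes: TBytes;
begin
  // #$00E9 is U+00E9 (e with acute accent), which takes two bytes in UTF-8.
  Bytes := XmlToUtf8Bytes('<Name>caf'#$00E9'</Name>');
  Writeln('UTF-8 byte count: ', Length(Bytes));
end.
```

The resulting TBytes could then be used as the HTTP request body, so that the declared encoding and the bytes actually sent agree.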