I have a rather large XSLT template that contains bilingual text (national characters in UTF-8). I am looking for a function that will recode all non-ASCII character data inside it as XML numeric character references (&#nnn;), so that I can store the XSLT in plain US-ASCII encoding.

Here is a basic example:

<?xml version="1.0" encoding="UTF-8"?>
<test>Soirée</test>

where é is encoded in UTF-8 as the two bytes C3 A9. The desired output would be:

<?xml version="1.0" encoding="US-ASCII"?>
<test>Soir&#233;e</test>

where &#233; corresponds to the codepoint for 'LATIN SMALL LETTER E WITH ACUTE' (U+00E9). Changing the encoding preamble on the first example results in an error as the UTF-8 bytes become invalid.

Is there a simple way to do this or do I have to resort to a macro?

Stavr00
  • "Changing the encoding preamble…": Yes, the declaration is descriptive (what is), not prescriptive (what to make it). You should be able to let a tool or library write the encoding declaration based on the encoding it was directed to use. So, "how" is the question but for XML Spy, I don't know the answer. (Of course, you could use an XSLT on your XSLT file.) – Tom Blodget Nov 22 '18 at 15:11
    For now I am using an XSLT `copy @*|node()` with an output encoding of `US-ASCII`. – Stavr00 Nov 23 '18 at 18:07
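
The identity-transform workaround described in the comment above could look roughly like the following. This is only a sketch: it assumes an XSLT 1.0 processor that honours the `encoding` attribute of `xsl:output` and writes characters outside the output encoding as numeric character references, which the XSLT 1.0 recommendation says a processor should do for text and attribute values. Processor-specific behaviour (including XML Spy's) has not been checked here.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the serializer for US-ASCII output; characters that cannot be
       represented (such as U+00E9) are expected to be written as
       character references like &#233; -->
  <xsl:output method="xml" encoding="US-ASCII"/>

  <!-- Identity template: copy every attribute and node unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the first example, this should yield the `Soir&#233;e` form shown in the question. One caveat: character references are only possible in text and attribute values, so non-ASCII characters in element names, comments, or processing instructions cannot be escaped this way, and CDATA sections in the input will generally be written out as ordinary escaped text.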

0 Answers