I'm experimenting with writing an XSLT stylesheet to generate HTML from a text encoded in XML according to the TEI standard.
Now, when it comes to special characters, I'm running into difficulties. Here's an example:
The word "ſem" (normalized "sem", the Old Norse relative pronoun) would be encoded as <g ref="#slong"/>em, which refers to the following declaration in the header:
<glyph xml:id="slong">
<glyphName>LATIN SMALL LETTER LONG S</glyphName>
<mapping type="facs">U+017F</mapping>
<mapping type="norm">s</mapping>
</glyph>
The idea is to look up the mappings for every glyph and then display it accordingly.
E.g. if I wanted a stylesheet that shows a normalized rendering of the text, I'd have something like
<!-- store all my glyphs in a key -->
<xsl:key name="glyphs" match="tei:glyph" use="@xml:id"/>
<!-- handle glyphs, storing every step in a variable for debugging purposes -->
<xsl:template match="tei:g">
<xsl:variable name="g_name" select="substring(@ref,2)"/>
<xsl:variable name="glyph" select="key('glyphs', $g_name)"/>
<xsl:variable name="mapping" select="$glyph/tei:mapping[@type='norm']"/>
<xsl:value-of select="$mapping"/>
</xsl:template>
This would, as expected, output "sem".
But, if I want to write a stylesheet that displays the text diplomatically, I'd want the output to be "ſem".
For that, I started with:
<xsl:template match="tei:g">
<xsl:variable name="g_name" select="substring(@ref,2)"/>
<xsl:variable name="glyph" select="key('glyphs', $g_name)"/>
<xsl:variable name="mapping" select="$glyph/tei:mapping[@type='facs']"/>
<xsl:value-of select="$mapping"/>
</xsl:template>
That gave me "U+017Fem". Of course, that's not an HTML entity for the expected special character.
So I tried:
<xsl:template match="tei:g">
<xsl:variable name="g_name" select="substring(@ref,2)"/>
<xsl:variable name="glyph" select="key('glyphs', $g_name)"/>
<xsl:variable name="mapping" select="$glyph/tei:mapping[@type='facs']"/>
<xsl:variable name="entity" select="concat('&amp;#x',substring($mapping,3),';')"/>
<xsl:value-of select="$entity"/>
</xsl:template>
That outputs &#x017F;em, which looks a lot more like an HTML hex entity. But sadly, it still gets displayed as such and is not interpreted as the character the entity represents.
And I can't for the life of me figure out how to get it to do that.
PS: In case it helps: I'm not writing a stylesheet to create an HTML file that I open in the browser afterwards; I have an HTML file with a JavaScript function that converts the XML data to HTML "on the fly".
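To make the conversion I'm after concrete, here is a minimal JavaScript sketch of it, outside of XSLT. It is not part of my stylesheets; the function name codePointToChar is made up for illustration, and it assumes the mapping text always has the form "U+" followed by four to six hex digits.

```javascript
// Hypothetical helper (not from the stylesheets above): turn a TEI mapping
// value such as "U+017F" into the character it names.
function codePointToChar(mapping) {
  // Assumption: the value is "U+" followed by 4 to 6 hexadecimal digits.
  const match = /^U\+([0-9A-Fa-f]{4,6})$/.exec(mapping.trim());
  if (!match) {
    throw new Error("Not a U+XXXX code point: " + mapping);
  }
  // parseInt(..., 16) reads the hex digits; fromCodePoint builds the character.
  return String.fromCodePoint(parseInt(match[1], 16));
}

console.log(codePointToChar("U+017F") + "em"); // prints "ſem"
```

The XSLT equivalent of this step is what I'm missing: turning the hex digits into the character itself rather than into a string that merely looks like a character reference.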
Edit:
As pointed out by Martin Honnen, on non-Mozilla browsers, <xsl:value-of select="$entity" disable-output-escaping="yes"/> should suffice (see https://xsltfiddle.liberty-development.net/ejivdH4/2).
Yet, for me, that still doesn't work, so I'm guessing I'm missing something important. Here are my full files (file.xml is shortened/changed because the original is a work in progress by others, but the result is the same).
file.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng" type="application/xml"
schematypens="http://purl.oclc.org/dsdl/schematron"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
<teiHeader>
<fileDesc>
<titleStmt>
<title>Title</title>
</titleStmt>
<publicationStmt>
<p>Publication Information</p>
</publicationStmt>
<sourceDesc>
<p>Information about the source</p>
</sourceDesc>
</fileDesc>
<encodingDesc>
<charDecl>
<desc>Variant letter forms</desc>
<glyph xml:id="aalig">
<glyphName>LATIN SMALL LIGATURE AA</glyphName>
<mapping type="facs">U+EFA0</mapping>
<mapping type="norm">aa</mapping>
</glyph>
<glyph xml:id="fins">
<glyphName>LATIN SMALL LETTER INSULAR F</glyphName>
<mapping type="facs">U+F207</mapping>
<mapping type="norm">f</mapping>
</glyph>
<glyph xml:id="jscap">
<glyphName>LATIN LETTER SMALL CAPITAL J</glyphName>
<mapping type="facs">U+1D0A</mapping>
</glyph>
<glyph xml:id="nscap">
<glyphName>LATIN LETTER SMALL CAPITAL N</glyphName>
<mapping type="facs">U+0274</mapping>
</glyph>
<glyph xml:id="rrot">
<glyphName>LATIN SMALL LETTER R ROTUNDA</glyphName>
<mapping type="facs">U+A75B</mapping>
<mapping type="norm">r</mapping>
</glyph>
<glyph xml:id="rscap">
<glyphName>LATIN LETTER SMALL CAPITAL R</glyphName>
<mapping type="facs">U+0280</mapping>
</glyph>
<glyph xml:id="slong">
<glyphName>LATIN SMALL LETTER LONG S</glyphName>
<mapping type="facs">U+017F</mapping>
<mapping type="norm">s</mapping>
</glyph>
<glyph xml:id="sscap">
<glyphName>LATIN LETTER SMALL CAPITAL S</glyphName>
<mapping type="facs">U+A731</mapping>
</glyph>
</charDecl>
<charDecl>
<desc>Abbreviation marks</desc>
<glyph xml:id="ar">
<glyphName>LATIN ABBREVIATION SIGN</glyphName>
<mapping type="facs">U+036C</mapping>
</glyph>
<glyph xml:id="asup">
<glyphName>COMBINING LATIN SMALL LETTER A</glyphName>
<mapping type="facs">U+0363</mapping>
</glyph>
<glyph xml:id="bar">
<glyphName>COMBINING ABBREVIATION MARK BAR ABOVE</glyphName>
<mapping type="facs">U+0305</mapping>
</glyph>
<glyph xml:id="combcurl">
<glyphName>COMBINING OGONEK ABOVE</glyphName>
<mapping type="facs">U+1DCE</mapping>
</glyph>
<glyph xml:id="csup">
<glyphName>COMBINING LATIN SMALL LETTER C</glyphName>
<mapping type="facs">U+0368</mapping>
</glyph>
<glyph xml:id="dot">
<glyphName>DOT ABOVE</glyphName>
<mapping type="facs">U+02D9</mapping>
</glyph>
<glyph xml:id="dsup">
<glyphName>COMBINING LATIN SMALL LETTER D</glyphName>
<mapping type="facs">U+0369</mapping>
</glyph>
<glyph xml:id="er">
<glyphName>COMBINING ABBREVIATION MARK ZIGZAG ABOVE</glyphName>
<mapping type="facs">U+035B</mapping>
</glyph>
<glyph xml:id="et">
<glyphName>LATIN ABBREVIATION SIGN SMALL ET WITH STROKE</glyphName>
<mapping type="facs">U+F158</mapping>
<mapping type="norm">&amp;</mapping>
</glyph>
<glyph xml:id="ezh">
<glyphName>LATIN SMALL LETTER EZH</glyphName>
<mapping type="facs">U+0292</mapping>
</glyph>
<glyph xml:id="isup">
<glyphName>COMBINING LATIN SMALL LETTER I</glyphName>
<mapping type="facs">U+0365</mapping>
</glyph>
<glyph xml:id="nsup">
<glyphName>COMBINING LATIN SMALL LETTER N</glyphName>
<mapping type="facs">U+F021</mapping>
</glyph>
<glyph xml:id="osup">
<glyphName>COMBINING LATIN SMALL LETTER O</glyphName>
<mapping type="facs">U+0366</mapping>
</glyph>
<glyph xml:id="ra">
<glyphName>COMBINING LATIN SMALL LETTER FLATTENED OPEN A ABOVE</glyphName>
<mapping type="facs">U+F1C1</mapping>
</glyph>
<glyph xml:id="rsup">
<glyphName>COMBINING LATIN SMALL LETTER R</glyphName>
<mapping type="facs">U+036C</mapping>
</glyph>
<glyph xml:id="tsup">
<glyphName>COMBINING LATIN SMALL LETTER T</glyphName>
<mapping type="facs">U+036D</mapping>
</glyph>
<glyph xml:id="ur">
<glyphName>COMBINING ABBREVIATION MARK SUPERSCRIPT UR ROUND R FORM</glyphName>
<mapping type="facs">U+F153</mapping>
</glyph>
<glyph xml:id="us">
<glyphName>COMBINING US ABOVE</glyphName>
<mapping type="facs">U+1DD2</mapping>
</glyph>
<glyph xml:id="zsup">
<glyphName>COMBINING LATIN SMALL LETTER Z</glyphName>
<mapping type="facs">U+00B3</mapping>
</glyph>
</charDecl>
</encodingDesc>
</teiHeader>
<text>
<body>
<!-- Add your data between here ... -->
<div type="miracle" n="75">
<pb n="473"/>
<head> <lb n="2"/>Bla</head>
<p>
<g ref="#slong"/>em
</p>
</div>
</body>
</text>
</TEI>
page.html:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8"/>
<script>
function loadXMLDoc(filename)
{
if (window.ActiveXObject)
{
xhttp = new ActiveXObject("Msxml2.XMLHTTP");
}
else
{
xhttp = new XMLHttpRequest();
}
xhttp.open("GET", filename, false);
try {xhttp.responseType = "msxml-document"} catch(err) {} // Helping IE11
xhttp.send("");
return xhttp.responseXML;
}
function displayResult(style)
{
console.log('Generating...');
xml = loadXMLDoc("file.xml");
xsl = loadXMLDoc(style);
// code for IE
if (window.ActiveXObject || xhttp.responseType == "msxml-document")
{
ex = xml.transformNode(xsl);
document.getElementById("example").innerHTML = ex;
}
// code for Chrome, Firefox, Opera, etc.
else if (document.implementation && document.implementation.createDocument)
{
xsltProcessor = new XSLTProcessor();
xsltProcessor.importStylesheet(xsl);
resultDocument = xsltProcessor.transformToFragment(xml, document);
const node = document.getElementById("example");
while (node.firstChild){
node.removeChild(node.firstChild);
}
node.appendChild(resultDocument);
}
}
</script>
</head>
<body onload="displayResult('facs.xsl')">
<h1>Test</h1>
<div>
<button onclick="displayResult('facs.xsl')">facs</button>
<button onclick="displayResult('dipl.xsl')">dipl</button>
</div>
<div id="example"></div>
</body>
</html>
facs.xsl:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:tei="http://www.tei-c.org/ns/1.0">
<xsl:key name="glyphs" match="tei:glyph" use="@xml:id"/>
<xsl:template match="/">
<h3>TEI Rendering: Facsimile</h3>
<div>
<xsl:apply-templates select="//tei:div[@type='miracle']"/>
</div>
</xsl:template>
<xsl:template match="tei:div[@type='miracle']">
<h5>
Miracle:
<xsl:value-of select="@n"/>
</h5>
<div class="miracle">
<xsl:apply-templates/>
</div>
</xsl:template>
<xsl:template match="tei:head">
<div style="color:red">
<xsl:apply-templates/>
</div>
</xsl:template>
<xsl:template match="tei:pb">
<br/>
(<xsl:value-of select="@n"/>)
<br/>
</xsl:template>
<xsl:template match="tei:lb">
<br/><xsl:value-of select="@n"/>:
</xsl:template>
<xsl:template match="tei:am">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="tei:g">
<xsl:variable name="g_name" select="substring(@ref,2)"/>
<xsl:variable name="glyph" select="key('glyphs', $g_name)"/>
<xsl:variable name="mapping" select="$glyph/tei:mapping[@type='facs']"/>
<xsl:variable name="entity" select="concat('&amp;#x',substring($mapping,3),';')"/>
<xsl:value-of select="$entity" disable-output-escaping="yes"/>
<xsl:variable name="something" select="'&#x0305;'"/>
{<xsl:value-of select="$something" disable-output-escaping="yes"/>}
</xsl:template>
</xsl:stylesheet>
dipl.xsl:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:tei="http://www.tei-c.org/ns/1.0">
<xsl:template match="/">
<h3>TEI Rendering: Diplomatic</h3>
<div>
<xsl:apply-templates select="//tei:div[@type='miracle']"/>
</div>
</xsl:template>
<xsl:template match="tei:div[@type='miracle']">
<h5>
Miracle:
<xsl:value-of select="@n"/>
</h5>
<div class="miracle">
<xsl:apply-templates/>
</div>
</xsl:template>
<xsl:template match="tei:head">
<div style="color:red">
<xsl:apply-templates/>
</div>
</xsl:template>
<xsl:template match="tei:pb">
||
</xsl:template>
<xsl:template match="tei:lb">
|
</xsl:template>
<xsl:template match="tei:ex">
<i>
<xsl:apply-templates/>
</i>
</xsl:template>
</xsl:stylesheet>
I'm viewing the file via localhost (with a Python server running) in my browser.
Any thoughts on what I might be missing or doing wrong?
Note: A lookup table is not what I want, because potentially there might be as many special characters in a TEI XML file as there are Unicode characters. That's what the glyph mappings are there for.
XSLT 2.0 might be an option, but I haven't figured out how to do a 2.0 transformation in the browser via JavaScript.
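One client-side workaround that avoids both a lookup table and disable-output-escaping could be sketched as follows: let the stylesheet emit the reference as plain text (e.g. &#x017F;) and decode it in JavaScript afterwards. This is only a sketch under that assumption; decodeHexReferences is a hypothetical name, and it only handles hexadecimal references of the form &#x...;.

```javascript
// Hypothetical post-processing step: replace hexadecimal character references
// that ended up as literal text (e.g. "&#x017F;") with the actual characters.
function decodeHexReferences(text) {
  return text.replace(/&#x([0-9A-Fa-f]{1,6});/g, (whole, hex) => {
    const codePoint = parseInt(hex, 16);
    // Values beyond the Unicode range are left as they are.
    return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : whole;
  });
}

console.log(decodeHexReferences("&#x017F;em")); // prints "ſem"
```

Such a function could be applied to the text nodes of the fragment returned by transformToFragment before it is appended to the page. I haven't verified this against my actual files.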
Edit 2:
I don't know what went wrong when I first tested it, but on IE it works with <xsl:value-of select="$entity" disable-output-escaping="yes"/>.
But since it doesn't work in Firefox, I decided to change the whole design: I transform the XML on the server side with PHP and send the HTML to the client; that should work with every browser.