While working on a custom XSL stylesheet for a TEI text, I have run into a big problem. In the TEI specification, standard notes from a book are encoded like this (assuming that dînai is the word that, on the printed page, would carry the note):

<p>— Cher Cléonte, me dit-il, dès que <pb xml:id="nn2_4" n="4"/> je fus entré dans sa 
chambre, je vous ai envoyé chercher afin que vous prissiez part au divertissement que je
dois avoir aujourd'hui. Je dînai<note type="glossary" n="nn2_4n1">déjeunai</note> il y a quelque temps en
un lieu où il se rencontra trois de ces messieurs qui font profession de ne rien ignorer</p>

This is not very XML-friendly, but that's the standard. What I would like to do, through XSL processing, is wrap "dînai" in some HTML (and do the same for every word that comes immediately before a <note> tag):

<span id="nn2_4n1" class="glossary">dînai</span>

(The content of the note is located elsewhere on the web page.)

The purpose is, for instance, to make the content of the note appear over the word dînai on a simple hover.

How can I select and process the last word before a tag? Is there a way to do this with XSL, or in some other way?

Thanks a lot for your answers! Christophe

P.S.: I'm very sorry for my terrible English.

DonRamiro
  • I believe TEI is XML-based, so it should be possible. Could you amend your question to show a larger sample of the TEI you are processing (i.e., the elements that surround 'albatross'), as well as the HTML you would like to generate? Thank you! – Tim C Jul 09 '13 at 17:11
  • And please tell us whether you want to use XSLT 2.0 or 1.0, as `for-each-group group-ending-with="note"` might do it easily. – Martin Honnen Jul 09 '13 at 17:34
  • Thanks a lot for your interest! I tried to amend my question as you asked; I hope it is clearer now. Nothing prevents me from using XSLT 2.0, as compatibility with old browsers is not required. – DonRamiro Jul 09 '13 at 17:44
  • What happens with the `note` elements, do you want them removed or transformed as well? Or do you only want to transform the word before a `note` element? – Martin Honnen Jul 09 '13 at 18:05
  • I would like to retrieve the note element and write it at a specific place in the resulting HTML file. But this seems to be another problem and I don't want to take up too much of your time ;-) – DonRamiro Jul 09 '13 at 18:22

1 Answer

Assuming an XSLT 2.0 processor, you can use

<xsl:stylesheet
  version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs">

<!-- Identity template: copies every attribute and node unchanged, in all modes -->
<xsl:template match="@* | node()" mode="#all">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()" mode="#current"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="*[note]">
  <xsl:copy>
    <xsl:for-each-group select="node()" group-ending-with="note">
      <xsl:choose>
        <xsl:when test="current-group()[last()][self::note]">
          <xsl:apply-templates select="current-group()[position() lt last() - 2]"/>
          <xsl:apply-templates select="current-group()[last() - 1]" mode="wrap"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

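<!-- Wraps the last word of the text node that precedes a note -->
<xsl:template match="text()" mode="wrap">
  <!-- \w+$ matches the run of word characters at the very end of the string -->
  <xsl:analyze-string select="." regex="\w+$">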
    <xsl:matching-substring>
      <span id="{current-group()[last()]/@n}" class="{current-group()[last()]/@type}">
        <xsl:value-of select="."/>
      </span>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:value-of select="."/>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>

</xsl:stylesheet>

It transforms the input

<p>— Cher Cléonte, me dit-il, dès que <pb xml:id="nn2_4" n="4"/> je fus entré dans sa 
chambre, je vous ai envoyé chercher afin que vous prissiez part au divertissement que je
dois avoir aujourd'hui. Je dînai<note type="glossary" n="nn2_4n1">déjeunai</note> il y a quelque temps en
un lieu où il se rencontra trois de ces messieurs qui font profession de ne rien ignorer</p>

into the result

<p>— Cher Cléonte, me dit-il, dès que  je fus entré dans sa 
chambre, je vous ai envoyé chercher afin que vous prissiez part au divertissement que je
dois avoir aujourd'hui. Je <span id="nn2_4n1" class="glossary">dînai</span> il y a quelque temps en
un lieu où il se rencontra trois de ces messieurs qui font profession de ne rien ignorer</p>

The stylesheet currently drops the note element from the input; if you want it processed in place, change

<xsl:when test="current-group()[last()][self::note]">
  <xsl:apply-templates select="current-group()[position() lt last() - 2]"/>
  <xsl:apply-templates select="current-group()[last() - 1]" mode="wrap"/>
</xsl:when>

to

<xsl:when test="current-group()[last()][self::note]">
  <xsl:apply-templates select="current-group()[position() lt last() - 2]"/>
  <xsl:apply-templates select="current-group()[last() - 1]" mode="wrap"/>
  <xsl:apply-templates select="current-group()[last()]"/>
</xsl:when>
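
With that change the note is still only copied as-is by the identity template. If it should become HTML instead, a dedicated template can be added to the stylesheet. The following is only a sketch: the class name note-content and the data-for attribute are hypothetical names chosen for illustration, and it assumes, as in the sample, that the elements are in no namespace.

```xml
<!-- Hypothetical rendering of a note as a span that points back to the
     wrapped word through its id. "note-content" and "data-for" are
     illustrative names, not part of TEI or of the stylesheet above. -->
<xsl:template match="note">
  <span class="note-content" data-for="{@n}">
    <!-- The note's text is copied through by the identity template -->
    <xsl:apply-templates/>
  </span>
</xsl:template>
```

Because a match on an element name has a higher default priority than the identity template's node() pattern, this template is chosen for note elements without an explicit priority attribute.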

I have assumed that a note is preceded by a plain text node, as in your sample. If we had e.g. Je <b>dînai</b><note type="glossary" n="nn2_4n1">déjeunai</note>, it would be more complicated.

I have only tested with the one simple input sample, so test it yourself with more complex ones and report back if you encounter problems.
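
One thing worth checking against the sample: the pb element of the input is missing from the result shown above. With position() lt last() - 2, the node two positions before the note (here the pb) is never selected. Below is a sketch of a variant under three assumptions that are mine, not confirmed by the sample: the intended bound is last() - 1, the attributes of the parent element should be kept, and the note is handed to the wrap template as a tunnel parameter (named note here) instead of being read through current-group() inside that template.

```xml
<xsl:stylesheet
  version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- Identity template: copies every attribute and node unchanged, in all modes -->
<xsl:template match="@* | node()" mode="#all">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()" mode="#current"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="*[note]">
  <xsl:copy>
    <!-- Assumption: attributes of the parent element should be preserved -->
    <xsl:apply-templates select="@*"/>
    <xsl:for-each-group select="node()" group-ending-with="note">
      <xsl:choose>
        <xsl:when test="current-group()[last()][self::note]">
          <!-- Assumed bound: all nodes before the one that precedes the note -->
          <xsl:apply-templates select="current-group()[position() lt last() - 1]"/>
          <!-- The node right before the note is processed in wrap mode -->
          <xsl:apply-templates select="current-group()[last() - 1]" mode="wrap">
            <xsl:with-param name="note" select="current-group()[last()]" tunnel="yes"/>
          </xsl:apply-templates>
          <!-- As in the original, the note itself is not output here -->
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

<!-- The explicit priority avoids an ambiguous match with the identity template -->
<xsl:template match="text()" mode="wrap" priority="1">
  <xsl:param name="note" as="element()?" tunnel="yes"/>
  <!-- \w+$ matches the run of word characters at the very end of the string -->
  <xsl:analyze-string select="." regex="\w+$">
    <xsl:matching-substring>
      <span id="{$note/@n}" class="{$note/@type}">
        <xsl:value-of select="."/>
      </span>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:value-of select="."/>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>

</xsl:stylesheet>
```

With the sample input, the group ending at the note holds four nodes (text, pb, text, note), so the first two are copied by the identity template and only the text node directly before the note goes through wrap mode. If that text node ends with a space or punctuation, \w+$ matches nothing and no span is produced, so such inputs are worth including in your tests.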

Martin Honnen
  • Thank you very, very much for your solution! It's quite incredible, and definitely nice. I must admit, however, that I don't understand how it works. You probably don't have the time, but I can't really understand what the first block does (the one with match="@*…). And, sorry to be such a noob at regex, but I wasn't able to find out what \w+$ refers to. Anyway: A BIG THANK YOU FOR YOUR HELP! – DonRamiro Jul 09 '13 at 18:46
  • Well the template with ` – Martin Honnen Jul 10 '13 at 09:36
  • Here is a link to the publisher's website: http://www.wrox.com/WileyCDA/WroxTitle/XSLT-2-0-and-XPath-2-0-Programmer-s-Reference-4th-Edition-Print-eBook-Bundle.productCd-1118642929.html. – Martin Honnen Jul 10 '13 at 09:36
  • Thanks a lot for the reference. I will certainly read it. And, even if I'm repeating myself: a BIG thank you. – DonRamiro Jul 10 '13 at 13:40