I'm trying to develop an XSLT stylesheet which transforms a given DocBook document to a file which can be fed to the lout document formatting system (which then generates PostScript output).

Doing so requires that I replace a few characters in the text of DocBook elements because they have a special meaning to lout. In particular, the characters

/ | & { } # @ ~ \ "

need to be enclosed in double quotes (") so that lout treats them as ordinary characters.

For instance, a DocBook element like

<para>This is a sample {a contrived one at that} ~ it serves no special purpose.</para>

should be transformed to

@PP
This is a sample "{"a contrived one at that"}" "~" it serves no special purpose.

How can I do this with XSLT? I'm using xsltproc, so using XPath 2.0 functions is not an option but a number of EXSLT functions are available.

I tried using a recursive template which yields the substring up to a special character (e.g. {), then the escaped character sequence ("{") and then calls itself on the substring after the special character. However, I have a hard time making this work properly when trying to replace multiple characters, and one of them is used in the escaped sequence itself.

Frerich Raabe
  • What is "lout"? Your DocBook example is not a well-formed XML document. Apart from this, the desired replacements can be easily accomplished with the `str-map` template/function of FXSL -- I'll post my answer in 2 hours from now after I'm back home from work. – Dimitre Novatchev Aug 18 '10 at 22:54
  • @Dimitre: See http://sourceforge.net/apps/mediawiki/lout for information about lout; I now introduced a hyperlink to that page into the question. Also, you're right that the example was not well-formed. I adjusted it so that it uses `~` instead of `&`. – Frerich Raabe Aug 18 '10 at 23:07
  • Good question (+1). See my answer for two complete XSLT 1.0 solutions -- with the `str-map` template of FXSL and with manually-written recursive named template. – Dimitre Novatchev Aug 19 '10 at 02:13

1 Answer

In particular, the characters

/ | & { } # @ ~ \ " 

need to be enclosed in double quotes (") so that lout treats them as ordinary characters.

I. This is most easily accomplished using the str-map template of FXSL:

<xsl:stylesheet version="1.0" 
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:f="http://fxsl.sf.net/"
 xmlns:strmap="strmap"
 exclude-result-prefixes="xsl f strmap">
   <xsl:import href="str-dvc-map.xsl"/>

   <xsl:output method="text"/>

   <strmap:strmap/>

   <xsl:template match="/">
     <xsl:variable name="vMapFun" select="document('')/*/strmap:*[1]"/>
     @PP
     <xsl:call-template name="str-map">
       <xsl:with-param name="pFun" select="$vMapFun"/>
       <xsl:with-param name="pStr" select="."/>
     </xsl:call-template>
   </xsl:template>

    <!-- Applied by str-map to each character of pStr; arg1 is that character -->
    <xsl:template name="escape" match="strmap:*" mode="f:FXSL">
      <xsl:param name="arg1"/>

      <xsl:variable name="vspecChars">/|&amp;{}#@~\"</xsl:variable>

      <!-- A double quote if arg1 is special, otherwise the empty string -->
      <xsl:variable name="vEscaping" select=
       "substring('&quot;', 1 div contains($vspecChars, $arg1))
       "/>

      <xsl:value-of select=
      "concat($vEscaping, $arg1, $vEscaping)"/>
    </xsl:template>

</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<para>This is a sample {a contrived one at that} ~ it serves no special purpose.</para>

the wanted, correct result is produced:

@PP This is a sample "{"a contrived one at that"}" "~" it serves no special purpose.
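
The `substring('&quot;', 1 div contains(...))` expression acts as a compact conditional. In XPath 1.0, `contains()` returns a boolean, which `div` converts to 1 or 0. For a special character the start position is `1 div 1 = 1`, so `substring()` returns the double quote; for any other character it is `1 div 0 = Infinity`, so `substring()` returns the empty string. As a rough illustration, the same decision can be written out with `xsl:choose`. The template name `quote-char` and its parameter `pChar` are hypothetical and serve only this sketch:

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>

   <!-- Hypothetical helper: quotes a single character the verbose way -->
   <xsl:template name="quote-char">
     <xsl:param name="pChar"/>
     <xsl:param name="pspecChars">/|&amp;{}#@~\"</xsl:param>

     <xsl:choose>
       <!-- Special character: wrap it in double quotes -->
       <xsl:when test="contains($pspecChars, $pChar)">
         <xsl:value-of select="concat('&quot;', $pChar, '&quot;')"/>
       </xsl:when>
       <!-- Ordinary character: copy it unchanged -->
       <xsl:otherwise>
         <xsl:value-of select="$pChar"/>
       </xsl:otherwise>
     </xsl:choose>
   </xsl:template>

   <!-- Demo: quote a literal tilde, whatever the input document is -->
   <xsl:template match="/">
     <xsl:call-template name="quote-char">
       <xsl:with-param name="pChar" select="'~'"/>
     </xsl:call-template>
   </xsl:template>
</xsl:stylesheet>
```

Applied to any XML document, this sketch outputs `"~"`. The `substring()` form reaches the same result in a single XPath expression, which is why it fits into one `xsl:variable`.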

II. With a recursive named template in plain XSLT 1.0:

<xsl:stylesheet version="1.0" 
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>

   <xsl:template match="/">
     @PP
     <xsl:call-template name="escape">
       <xsl:with-param name="pStr" select="."/>
     </xsl:call-template>
   </xsl:template>

    <xsl:template name="escape">
     <xsl:param name="pStr" select="."/>
     <xsl:param name="pspecChars">/|&amp;{}#@~\"</xsl:param>

     <!-- Stop when the string is exhausted -->
     <xsl:if test="string-length($pStr)">
          <xsl:variable name="vchar1" select="substring($pStr,1,1)"/>

          <!-- A double quote if vchar1 is special, otherwise the empty string -->
          <xsl:variable name="vEscaping" select=
           "substring('&quot;', 1 div contains($pspecChars, $vchar1))
           "/>

          <xsl:value-of select=
          "concat($vEscaping, $vchar1, $vEscaping)"/>

          <!-- Recurse on the remainder of the string -->
          <xsl:call-template name="escape">
           <xsl:with-param name="pStr" select="substring($pStr,2)"/>
           <xsl:with-param name="pspecChars" select="$pspecChars"/>
          </xsl:call-template>
     </xsl:if>
    </xsl:template>
</xsl:stylesheet>
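
The template above recurses once per character, so a long text node may run into a processor's recursion depth limit. One possible way around this in plain XSLT 1.0 is divide and conquer: split the string in half and process each half separately, which keeps the nesting depth at roughly log2 of the string length. The following is only a minimal sketch under that assumption, not the FXSL implementation; the template name `escape-dvc` is hypothetical:

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>

   <xsl:template match="/">
     <xsl:text>@PP&#10;</xsl:text>
     <xsl:call-template name="escape-dvc">
       <xsl:with-param name="pStr" select="string(.)"/>
     </xsl:call-template>
   </xsl:template>

   <!-- Hypothetical name: escapes pStr by halving it recursively -->
   <xsl:template name="escape-dvc">
     <xsl:param name="pStr"/>
     <xsl:param name="pspecChars">/|&amp;{}#@~\"</xsl:param>
     <xsl:variable name="vLen" select="string-length($pStr)"/>

     <xsl:choose>
       <!-- Base case: one character, quoted only if it is special -->
       <xsl:when test="$vLen = 1">
         <xsl:variable name="vEscaping" select=
          "substring('&quot;', 1 div contains($pspecChars, $pStr))"/>
         <xsl:value-of select="concat($vEscaping, $pStr, $vEscaping)"/>
       </xsl:when>
       <!-- Recursive case: handle the left half, then the right half -->
       <xsl:when test="$vLen > 1">
         <xsl:variable name="vHalf" select="floor($vLen div 2)"/>
         <xsl:call-template name="escape-dvc">
           <xsl:with-param name="pStr" select="substring($pStr, 1, $vHalf)"/>
           <xsl:with-param name="pspecChars" select="$pspecChars"/>
         </xsl:call-template>
         <xsl:call-template name="escape-dvc">
           <xsl:with-param name="pStr" select="substring($pStr, $vHalf + 1)"/>
           <xsl:with-param name="pspecChars" select="$pspecChars"/>
         </xsl:call-template>
       </xsl:when>
       <!-- An empty string produces no output -->
     </xsl:choose>
   </xsl:template>
</xsl:stylesheet>
```

With this shape, a string of 1000 characters needs a nesting depth of about 10 instead of 1000, at the cost of more `substring()` calls overall.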
Dimitre Novatchev
  • +1: Thanks for providing *two* solutions! I already anticipated that you would easily whip up an FXSL solution (given that it seems to be your own project ;-)) but I'm very grateful for also pointing out how it could be done with plain old XSLT 1.0. I'll try your solutions when I'm out of the office. – Frerich Raabe Aug 19 '10 at 08:36
  • The second variant recurses for every single character, right? That might be an issue for me (since xsltproc, by default, only allows 1000 nested function calls - and my texts might be more than 1000 characters long). I guess the first solution doesn't suffer from this (but I'm not sure, because I find it a bit hard to read ;-)). – Frerich Raabe Aug 19 '10 at 08:46
  • @Frerich-Raabe: There is a nice way to minimize the recursion depth. Please, ask a separate question and I will demo and explain this method. – Dimitre Novatchev Aug 19 '10 at 12:37
  • @Dimitre: +1 for FXSL and plain recursion. You've become greedy about the DVC pattern, ja! –  Aug 19 '10 at 15:07
  • @Alejandro: Few people will notice the DVC explanation if it is buried inside a question with totally unrelated title. This topic deserves its own question. – Dimitre Novatchev Aug 19 '10 at 16:09
  • @Alejandro: Ah! Thanks for mentioning DVC, I didn't think of that! – Frerich Raabe Aug 19 '10 at 21:33
  • @Frerich-Raabe: Actually, DVC *was* mentioned already: look at the code in my solution -- do you notice `<xsl:import href="str-dvc-map.xsl"/>`? :). Yes, for the FXSL solution I already chose the DVC implementation of `str-map`. Isn't it so convenient when you have DVC pre-coded for you and you don't need to code it manually? :) – Dimitre Novatchev Aug 19 '10 at 21:38
  • @Dimitre: Ah, no - I didn't notice. This is the first time I've heard of FXSL, and I didn't recognize the `dvc` part of the file name as the "divide and conquer" I know from other recursive algorithms. :-) – Frerich Raabe Aug 19 '10 at 22:14
  • @Frerich-Raabe: If this is the 1st time you heard of FXSL and you've had prior FP exposure, then you'd probably like it -- just read the FXSL 2 (XSLT 2.0 - based) conference article. – Dimitre Novatchev Aug 19 '10 at 22:58
  • @Dimitre: Your paper link is no longer functional. –  Aug 20 '10 at 01:11
  • @Alejandro, @Frerich-Raabe: Sorry, the PDF link is this: http://web.archive.org/web/20070222111927/http:/www.idealliance.org/papers/extreme/proceedings/xslfo-pdf/2006/Novatchev01/EML2006Novatchev01.pdf The HTML link is this: http://conferences.idealliance.org/extreme/html/2006/Novatchev01/EML2006Novatchev01.html – Dimitre Novatchev Aug 20 '10 at 02:00