I wanted to try out a few things, and I am now trying to split a string into blocks of 100 characters. This is how far I have gotten:

<xsl:template name="split100check">
    <xsl:param name="input"></xsl:param>
    <xsl:variable name="newInput" select="concat(normalize-space($input), ' ' )"></xsl:variable>
    <xsl:variable name="start" select="substring($input, 1, 100)"></xsl:variable>
    <xsl:variable name="end" select="substring($input, 101)"></xsl:variable>
    <PART>
        <xsl:value-of select="$start"></xsl:value-of>
    </PART>
    <xsl:if test="$end">
        <xsl:call-template name="split100check">
            <xsl:with-param name="input" select="$end"></xsl:with-param>
        </xsl:call-template>
    </xsl:if>
</xsl:template>

This does almost what I want to achieve. It splits a string into blocks of 100 characters, but it also splits words. For example:

<main>
    <long>
    A very long text here [....] only for test
    </long>
</main>

Let's say the first 100-character block ends in the middle of the word "only". The first block would then be "A very long text [....] on" and the second block "ly for test". How do I need to build the template so that it does not break words?

Note: I can only use XSLT 1.0.

Edit: To make it clearer, here is an example that splits into blocks of 10 characters.

Text: "Hello my friend". Split into 10-character blocks, my approach gives:

first block : <PART>Hello my f</PART>

second block : <PART>riend</PART>

I want the words not to be split, like this:

first block : <PART>Hello my </PART>

second block : <PART>friend </PART>

The first block is of course no longer exactly 10 characters long, but that does not matter. Each block should hold as many whole words as fit into 10 characters.

gz ALeks

Aleks
  • You didn't even really say what you want. What is "a word" for you, in technical terms? Where do you want the split to occur when it would happen in a word? -- If you define word as "anything delimited by space", want the split in front of words *and* are reasonably confident that word distribution is such that no words longer than 100 characters exist (think URLs etc), you can make use of [the `substring-before-last` template I've created for a different question](http://stackoverflow.com/questions/1119449/removing-the-last-characters-in-an-xslt-string/1119666#1119666). – Tomalak Oct 19 '11 at 07:17
  • Sorry for not being clear enough. Words are words for me :). As mentioned above, there is a very long text inside the tag; the [...] stands for that long text (imagine a long description of something). To explain with a smaller example: if we split into blocks of 10 characters, the text "Hello my friend" would be split into "Hello my f" and "riend". But I want "Hello my " and "friend". So the template should recognize when the last word does not fit into the block (10 characters in this case) and put it into the next block. – Aleks Oct 19 '11 at 10:45
  • And what if there is no word boundary for 10 characters? Then it should break forcibly, I suppose? What if the only word boundary is at character 2? Should it break so early and leave an awkwardly short line? (Also, did you give that `substring-before-last()` template a try?) – Tomalak Oct 19 '11 at 12:19

1 Answer

You can use the str-split-to-lines template from FXSL.

Here is an example:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:f="http://fxsl.sf.net/"
xmlns:ext="http://exslt.org/common"
xmlns:str-split2lines-func="f:str-split2lines-func"
exclude-result-prefixes="xsl f ext str-split2lines-func"
>


   <xsl:import href="dvc-str-foldl.xsl"/>

   <!-- to be applied on text.xml -->

   <str-split2lines-func:str-split2lines-func/>

   <xsl:output indent="yes" omit-xml-declaration="yes"/>

    <xsl:template match="/">
      <xsl:call-template name="str-split-to-lines">
        <xsl:with-param name="pStr" select="/*"/>
        <xsl:with-param name="pLineLength" select="60"/>
        <xsl:with-param name="pDelimiters" select="' &#9;&#10;&#13;'"/>
      </xsl:call-template>
    </xsl:template>

    <xsl:template name="str-split-to-lines">
      <xsl:param name="pStr"/>
      <xsl:param name="pLineLength" select="60"/>
      <xsl:param name="pDelimiters" select="' &#9;&#10;&#13;'"/>

      <xsl:variable name="vsplit2linesFun"
                    select="document('')/*/str-split2lines-func:*[1]"/>

      <xsl:variable name="vrtfParams">
       <delimiters><xsl:value-of select="$pDelimiters"/></delimiters>
       <lineLength><xsl:copy-of select="$pLineLength"/></lineLength>
      </xsl:variable>

      <xsl:variable name="vResult">
          <xsl:call-template name="dvc-str-foldl">
            <xsl:with-param name="pFunc" select="$vsplit2linesFun"/>
            <xsl:with-param name="pStr" select="$pStr"/>
            <xsl:with-param name="pA0" select="ext:node-set($vrtfParams)"/>
          </xsl:call-template>
      </xsl:variable>
      <xsl:for-each select="ext:node-set($vResult)/line">
        <xsl:for-each select="word">
          <xsl:value-of select="concat(., ' ')"/>
        </xsl:for-each>
        <xsl:value-of select="'&#xA;'"/>
      </xsl:for-each>
    </xsl:template>

    <xsl:template match="str-split2lines-func:*" mode="f:FXSL">
      <xsl:param name="arg1" select="/.."/>
      <xsl:param name="arg2"/>

      <xsl:copy-of select="$arg1/*[position() &lt; 3]"/>
      <xsl:copy-of select="$arg1/line[position() != last()]"/>

      <xsl:choose>
        <xsl:when test="contains($arg1/*[1], $arg2)">
          <xsl:if test="string($arg1/word) or string($arg1/line/word)">
             <xsl:call-template name="fillLine">
               <xsl:with-param name="pLine" select="$arg1/line[last()]"/>
               <xsl:with-param name="pWord" select="$arg1/word"/>
               <xsl:with-param name="pLineLength" select="$arg1/*[2]"/>
             </xsl:call-template>
          </xsl:if>
        </xsl:when>
        <xsl:otherwise>
          <xsl:copy-of select="$arg1/line[last()]"/>
          <word><xsl:value-of select="concat($arg1/word, $arg2)"/></word>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>

      <!-- Test if the new word fits into the last line -->
    <xsl:template name="fillLine">
      <xsl:param name="pLine" select="/.."/>
      <xsl:param name="pWord" select="/.."/>
      <xsl:param name="pLineLength" />

      <xsl:variable name="vnWordsInLine" select="count($pLine/word)"/>
      <xsl:variable name="vLineLength" 
       select="string-length($pLine) + $vnWordsInLine"/>
      <xsl:choose>
        <xsl:when test="not($vLineLength + string-length($pWord) 
                           > 
                            $pLineLength)">
          <line>
            <xsl:copy-of select="$pLine/*"/>
            <xsl:copy-of select="$pWord"/>
          </line>
        </xsl:when>
        <xsl:otherwise>
          <xsl:copy-of select="$pLine"/>
          <line>
            <xsl:copy-of select="$pWord"/>
          </line>
          <word/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>

</xsl:stylesheet>

When this transformation is applied on the following XML document:

<text>
Dec. 13 — As always for a presidential inaugural, security and surveillance were
extremely tight in Washington, DC, last January. But as George W. Bush prepared to
take the oath of office, security planners installed an extra layer of protection: a
prototype software system to detect a biological attack. The U.S. Department of
Defense, together with regional health and emergency-planning agencies, distributed
a special patient-query sheet to military clinics, civilian hospitals and even aid
stations along the parade route and at the inaugural balls. Software quickly
analyzed complaints of seven key symptoms — from rashes to sore throats — for
patterns that might indicate the early stages of a bio-attack. There was a brief
scare: the system noticed a surge in flulike symptoms at military clinics.
Thankfully, tests confirmed it was just that — the flu.
</text>

The wanted result is produced: lines whose length is as close as possible to 60 characters without exceeding it, broken only at word boundaries:

Dec. 13 — As always for a presidential inaugural, security 
and surveillance were extremely tight in Washington, DC, 
last January. But as George W. Bush prepared to take the 
oath of office, security planners installed an extra layer 
of protection: a prototype software system to detect a 
biological attack. The U.S. Department of Defense, together 
with regional health and emergency-planning agencies, 
distributed a special patient-query sheet to military 
clinics, civilian hospitals and even aid stations along the 
parade route and at the inaugural balls. Software quickly 
analyzed complaints of seven key symptoms — from rashes to 
sore throats — for patterns that might indicate the early 
stages of a bio-attack. There was a brief scare: the system 
noticed a surge in flulike symptoms at military clinics. 
Thankfully, tests confirmed it was just that — the flu. 
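
If importing FXSL is not an option, the same idea can be sketched in plain XSLT 1.0 with two recursive named templates. This is only a minimal sketch under some assumptions: the template names `split-words` and `before-last-space` and the `PARTS` wrapper element are made up for illustration and are not part of FXSL, a "word" is taken to be anything delimited by a single space after `normalize-space()`, and a word longer than the block size is broken forcibly.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <!-- Entry point; assumes the <main><long>...</long></main> layout from the question -->
  <xsl:template match="/">
    <PARTS>
      <xsl:call-template name="split-words">
        <xsl:with-param name="input" select="normalize-space(/main/long)"/>
        <xsl:with-param name="max" select="10"/>
      </xsl:call-template>
    </PARTS>
  </xsl:template>

  <!-- Hypothetical helper: emits one PART per block of at most $max characters.
       Expects $input to be whitespace-normalized already. -->
  <xsl:template name="split-words">
    <xsl:param name="input"/>
    <xsl:param name="max" select="100"/>
    <xsl:choose>
      <!-- The rest fits into one block: emit it (unless empty) and stop -->
      <xsl:when test="string-length($input) &lt;= $max">
        <xsl:if test="$input">
          <PART><xsl:value-of select="$input"/></PART>
        </xsl:if>
      </xsl:when>
      <xsl:otherwise>
        <!-- Look at $max + 1 characters, so that a space directly after
             the block still counts as a word boundary -->
        <xsl:variable name="head">
          <xsl:call-template name="before-last-space">
            <xsl:with-param name="text" select="substring($input, 1, $max + 1)"/>
          </xsl:call-template>
        </xsl:variable>
        <!-- No space in the window means one over-long word: break at $max -->
        <xsl:variable name="len">
          <xsl:choose>
            <xsl:when test="string-length($head) &gt; 0">
              <xsl:value-of select="string-length($head)"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="$max"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:variable>
        <PART><xsl:value-of select="substring($input, 1, $len)"/></PART>
        <!-- Recurse on the remainder; normalize-space drops the leading space -->
        <xsl:call-template name="split-words">
          <xsl:with-param name="input"
              select="normalize-space(substring($input, $len + 1))"/>
          <xsl:with-param name="max" select="$max"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Hypothetical helper: returns everything before the last space in $text,
       or the empty string if $text contains no space -->
  <xsl:template name="before-last-space">
    <xsl:param name="text"/>
    <xsl:if test="contains($text, ' ')">
      <xsl:value-of select="substring-before($text, ' ')"/>
      <xsl:if test="contains(substring-after($text, ' '), ' ')">
        <xsl:text> </xsl:text>
        <xsl:call-template name="before-last-space">
          <xsl:with-param name="text" select="substring-after($text, ' ')"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Applied to `<main><long>Hello my friend</long></main>` with `max` set to 10, this sketch should emit the blocks "Hello my" and "friend". Unlike the blocks shown in the question, the trailing space is dropped; append one inside `PART` if it is needed. For very long inputs the recursion depth grows with the number of blocks, which some XSLT 1.0 processors may limit.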
Dimitre Novatchev