I have looked at several approaches to unflattening with XSL, but none of them works for me, even though I believe my case is fairly simple. I have a collection of HTML files, all with the same structure, that I would like to unflatten with an XSL transformation. Basically, I need to wrap in a <div> element all the elements that follow a <p class='subtitle'>, up to the next <p class='subtitle'>, ideally while still transforming the elements individually, though that part is optional (see below).

The source file looks like this:

[...some stuff on the page]
<p class='header'>Some text</p>
<p class='subtitle'>Subtitle 1</p>
<p class='content'>First paragraph of part 1, with some <span>Inside</span> and other 
nested elements, on multiple levels</p>
<ul>a list with <li> inside</ul>
<p class='content'>Second paragraph of part 1</p>
<img src='xyz.jpg'/>
<p class='content'>Third paragraph of part 1</p>
<p class='subtitle'>Subtitle 2</p>
<p class='content'>First paragraph of part 2</p>
<p class='content'>Second paragraph of part 2</p>
<p class='subtitle'>Subtitle 3</p>
[and so on…]

I would like to turn this into:

<div n='section1'>
    <head>Subtitle 1</head>
    <p>First paragraph of part 1, with some <span>Inside</span> and other 
     nested elements, on multiple levels</p>
    <ul>a list with <li> inside</ul>
    <p>Second paragraph of part 1</p>
    <picture source='xyz.jpg'/>
    <p>Third paragraph of part 1</p>
</div>
<div n="section2">
    <head>Subtitle 2</head>
    <p>First paragraph of part 2</p>
    <p>Second paragraph of part 2</p>
</div>
<div n="section3">
    <head>Subtitle 3</head>
    [and so on…]

I cannot find a way to do this. Even a first step that only unflattens the HTML file (copying the elements into the div as they are, without transforming them) would already be a great help.

THANKS in advance!

DonRamiro

1 Answer

This is a classic positional grouping problem. To get you started:

<xsl:template match="body">
  <body>
    <xsl:for-each-group select="*" group-starting-with="p[@class='subtitle']">
      <xsl:choose>
        <xsl:when test="@class='subtitle'">
          <div n="section{position()}">
            <head>{.}</head>
            <xsl:apply-templates select="tail(current-group())"/>
          </div>
        </xsl:when>
        <xsl:otherwise>
           <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </body>
</xsl:template>
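
The xsl:apply-templates calls above pass each grouped element to whatever other template rules the stylesheet contains. As a rough sketch of what those rules might look like for the desired output in the question (assuming the HTML elements are in no namespace, and that anything not listed should simply be copied), the following could sit in the same stylesheet next to the body template:

```xml
<!-- Sketch only: companion rules for the same stylesheet as the body template. -->

<!-- Content paragraphs: keep the children, drop the class attribute. -->
<xsl:template match="p[@class='content']">
  <p>
    <xsl:apply-templates/>
  </p>
</xsl:template>

<!-- img becomes picture, with src renamed to source. -->
<xsl:template match="img">
  <picture source="{@src}"/>
</xsl:template>

<!-- Identity rule: copy every other node and attribute unchanged. -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>
```

With only the identity rule present, the grouped elements are copied into each div as they are, which matches the "first step" described in the question.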

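Note that xsl:for-each-group requires XSLT 2.0 or later, and that tail() and the {.} text value template (which needs expand-text="yes" on the stylesheet or an enclosing element) require XSLT 3.0. Also, position() counts the group of elements that precedes the first subtitle, if there is one, so in that case the section numbers start at 2 and need adjusting. It's considerably more difficult with XSLT 1.0.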
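
For anyone limited to XSLT 1.0, one common workaround is to use a key that indexes each element by the generated id of its nearest preceding subtitle. The following is a minimal sketch under a few assumptions: the elements are in no namespace, the subtitles are direct children of body, and the key name by-subtitle is made up for this example. It copies the grouped elements unchanged.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index every non-subtitle child of body by the id of the nearest
       preceding subtitle paragraph ("by-subtitle" is a hypothetical name). -->
  <xsl:key name="by-subtitle"
           match="body/*[not(self::p[@class='subtitle'])]"
           use="generate-id(preceding-sibling::p[@class='subtitle'][1])"/>

  <xsl:template match="body">
    <body>
      <!-- Elements before the first subtitle have no preceding subtitle, so
           generate-id() of the empty node-set gives them the key value ''. -->
      <xsl:apply-templates select="key('by-subtitle', '')"/>
      <!-- Each subtitle then builds its own section. -->
      <xsl:apply-templates select="p[@class='subtitle']"/>
    </body>
  </xsl:template>

  <xsl:template match="p[@class='subtitle']">
    <!-- position() is relative to the selected subtitles: 1, 2, 3, ... -->
    <div n="section{position()}">
      <head><xsl:value-of select="."/></head>
      <xsl:apply-templates select="key('by-subtitle', generate-id())"/>
    </div>
  </xsl:template>

  <!-- Identity rule: copy everything else unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because the subtitles are selected on their own here, the section numbers start at 1 even when other elements precede the first subtitle.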

Michael Kay