I'm having trouble with an HTML file that I'm trying to translate. The relevant part of the source structure currently looks like this:
<h2 />
<h3 />
<table />
<table />
<h3 />
<table />
<table />
<h3 />
<table />
<h3 />
<h3 />
<table />
<table />
<h2 />
<h3 />
...
and so on. The contents of each of these elements are translated in different ways, but my current problem is grouping them correctly. I want the result to look like this:
<category>
<h2 />
<container>
<h3 />
<table />
<table />
</container>
<container>
<h3 />
<table />
<table />
</container>
<container>
<h3 />
<table />
</container>
<container>
<h3 />
</container>
<container>
<h3 />
<table />
<table />
</container>
</category>
<category>
<h2 />
<container>
<h3 />
...
To achieve this, I've been using the following code:
<xsl:for-each-group select="node()" group-starting-with="xh:h2">
<category>
<xsl:apply-templates select="xh:h2"/>
<xsl:for-each-group select="current-group()"
group-starting-with="xh:h3">
<container>
<xsl:apply-templates select="current-group()[node()]"/>
</container>
</xsl:for-each-group>
</category>
</xsl:for-each-group>
However, the output I get from this is as follows:
<category>
<h2 />
<container>
<h3 />
<table />
<table />
<h3 />
<table />
<table />
<h3 />
<table />
<h3 />
<h3 />
<table />
<table />
</container>
</category>
<category>
<h2 />
<container>
<h3 />
...
The first `xsl:for-each-group` works as expected, but the second does not appear to. If I use `<xsl:copy-of>` to output the first element of `current-group()` in the second `xsl:for-each-group`, it shows the `<h2>` element, which should not even be in that group.
If anyone can point out where I'm going wrong, or offer a better solution, it would be greatly appreciated.
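One part of the behaviour described above is expected: with `group-starting-with`, the first item of the population always starts a group, whether or not it matches the pattern. Since the outer `current-group()` begins with the `<h2>`, the `<h2>` becomes the first item of the first inner group. The following is a minimal sketch of one way to work around that, not a definitive fix. It assumes the headings and tables are direct children of `xh:body`, that `xh` is bound to the XHTML namespace, and it uses placeholder copy templates in place of the real per-element translation.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xh="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xh">

  <!-- Assumption: h2, h3 and table are all direct children of xh:body. -->
  <xsl:template match="xh:body">
    <!-- select="*" keeps whitespace text nodes out of the population. -->
    <xsl:for-each-group select="*" group-starting-with="xh:h2">
      <category>
        <!-- "." is the first item of the group: the h2, if the group starts with one. -->
        <xsl:apply-templates select="self::xh:h2"/>
        <!-- Remove that first item before grouping again, so it cannot
             start (and end up inside) the first inner group. -->
        <xsl:for-each-group select="current-group() except ."
                            group-starting-with="xh:h3">
          <container>
            <xsl:apply-templates select="current-group()"/>
          </container>
        </xsl:for-each-group>
      </category>
    </xsl:for-each-group>
  </xsl:template>

  <!-- Placeholder for the real translation: copy the elements unchanged. -->
  <xsl:template match="xh:h2 | xh:h3 | xh:table">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Two caveats with this sketch: any elements that come before the first `<h2>` still form a `<category>` of their own (with its first element dropped by `except .`), and tables that sit between an `<h2>` and its first `<h3>` end up in a `<container>` without a heading. Both cases would need extra handling if they occur in the real input.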
element certainly solved some weird issues I was getting elsewhere; I had no idea about that first group. As you suggest, I have simplified the problem somewhat. Because this deals with an XHTML representation of a Microsoft Word document, the elements are quite inconsistent: h2 is actually found by the condition `node()[name()='h1' or name()='h2' or @class='Heading2NB' or @class='Heading2NoBreak' or @class='Heading2PageBreak']`, and h3 is found by `node()[name()='h3' or @class='Heading3NB' or @class='Heading3NoBreak']`. Thanks again.
– Dan McElroy Jul 04 '13 at 08:14
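The longer heading conditions quoted in the comment above can be kept out of the grouping attributes by wrapping them in stylesheet functions. This is only a sketch under assumptions: the `my` prefix, its namespace URI, and the function names `my:is-h2` and `my:is-h3` are made up for illustration, the elements are assumed to be direct children of a `body` element, and `local-name()` is substituted for the `name()` used in the comment so that a namespace prefix on the elements would not change the result.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:headings"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical helper: true for anything that counts as a level-2 heading. -->
  <xsl:function name="my:is-h2" as="xs:boolean">
    <xsl:param name="n" as="node()"/>
    <xsl:sequence select="local-name($n) = ('h1', 'h2')
        or $n/@class = ('Heading2NB', 'Heading2NoBreak', 'Heading2PageBreak')"/>
  </xsl:function>

  <!-- Hypothetical helper: true for anything that counts as a level-3 heading. -->
  <xsl:function name="my:is-h3" as="xs:boolean">
    <xsl:param name="n" as="node()"/>
    <xsl:sequence select="local-name($n) = 'h3'
        or $n/@class = ('Heading3NB', 'Heading3NoBreak')"/>
  </xsl:function>

  <!-- Assumption: the headings and tables are direct children of body. -->
  <xsl:template match="*:body">
    <xsl:for-each-group select="*" group-starting-with="*[my:is-h2(.)]">
      <category>
        <!-- Emit the first item only if it really is a level-2 heading. -->
        <xsl:copy-of select=".[my:is-h2(.)]"/>
        <!-- Group the rest of the category by level-3 headings. -->
        <xsl:for-each-group select="current-group() except ."
                            group-starting-with="*[my:is-h3(.)]">
          <container>
            <!-- copy-of stands in for the real per-element translation. -->
            <xsl:copy-of select="current-group()"/>
          </container>
        </xsl:for-each-group>
      </category>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Keeping each condition in one function means a newly discovered Word style class only has to be added in one place, instead of in every pattern that needs to recognise a heading.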