

  • Any <p> tag within the <body> tags should be transformed to Body_Text.

  • The <p> tags whose last (outermost) ancestor <sec> does not have the attribute "sec-type" should be transformed to Flush_Text (this overrides the Body_Text rule above).

  • The <p> tags whose last (outermost) ancestor is <sec sec-type="irrelevant-attribute-name"> (with the attribute "sec-type") should be transformed to Body_Text.




<sec><p>asdf</p></sec> should be transformed into <sec><Flush_Text>asdf</Flush_Text></sec>.

<sec sec-type="whatevs"><p>asdf</p></sec> should be <sec sec-type="whatevs"><Body_Text>asdf</Body_Text></sec>.


Also, any further nesting into an ancestor with this sec-type attribute should still be Body_Text:

<sec sec-type="whatevs"><sec><p>asdf</p></sec></sec> should be <sec sec-type="whatevs"><sec><Body_Text>asdf</Body_Text></sec></sec>.




Here is my XML:
<root>
  <body>
  <sec sec-type="asdf">
    <title>This is an H1</title>

    <sec>
      <title>This is an H2</title>

      <sec>
        <title>This is an H3</title>
        <p>This SHOULD be "Body_Text", but it's "Flush_Text"</p>
      </sec> <!-- end of H3 -->
    </sec> <!-- end of H2 -->
  </sec> <!-- end of H1 -->

  <sec>
    <p>This is Flush_Text</p>
  </sec>
    <p>This is Body_Text</p>
  </body>
</root>


...here is my XSL, which is not working correctly:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
<xsl:strip-space elements="*"/>

    <!-- identity rule -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

        <!-- Body_Text -->
        <xsl:template match="body//p">
            <Body_Text>
                <xsl:apply-templates select="@*|node()"/>
            </Body_Text>
        </xsl:template>

        <!-- Flush_Text -->
        <xsl:template match="sec//p">
          <xsl:if test="not(@sec-type)">
            <Flush_Text>
                <xsl:apply-templates select="@*|node()"/>
            </Flush_Text>
          </xsl:if>
        </xsl:template>

        <!-- H1 -->
        <xsl:template match="sec//title">
            <H1>
                <xsl:apply-templates select="@*|node()"/>
            </H1>
        </xsl:template>

        <!-- H2 -->
        <xsl:template match="sec//sec//title">
            <H2>
                <xsl:apply-templates select="@*|node()"/>
            </H2>
        </xsl:template>

        <!-- H3 -->
        <xsl:template match="sec//sec//sec//title">
            <H3>
                <xsl:apply-templates select="@*|node()"/>
            </H3>
        </xsl:template>
</xsl:stylesheet>


...and here is the incorrect output:

<?xml version="1.0" encoding="utf-16"?>
<root>
    <body>
        <sec sec-type="asdf">
            <H1>This is an H1</H1>
            <sec>
                <H2>This is an H2</H2>
                <sec>
                    <H3>This is an H3</H3>
                    <Flush_Text>This SHOULD be "Body_Text", but it's "Flush_Text"</Flush_Text>
                </sec>
                <!-- end of H3 -->
            </sec>
            <!-- end of H2 -->
        </sec>
        <!-- end of H1 -->
        <sec>
            <Flush_Text>This is Flush_Text</Flush_Text>
        </sec>
        <Body_Text>This is Body_Text</Body_Text>
    </body>
</root>

Note that the first instance of <p> in this example should be transformed to Body_Text, but it is being transformed as Flush_Text.

Ian Campbell
  • On what grounds is the last `<p>` in the XML document a `<Body_Text>`? According to your rules, it should be `<Flush_Text>`. (Quote: *"`<p>` tags that do not have a highest parent with this attribute should be transformed to `Flush_Text`."*) – Tomalak Aug 09 '12 at 22:30
  • Thanks for the help @Tomalak, and sorry for the confusion. So, the last `<p>` tag is *not* within a highest-parent `<sec>`, and so should be `<Body_Text>`. And so *also* should the `<p>` within the highest-parent `<sec sec-type="asdf">` be `<Body_Text>`. *Only* the `<p>` within the highest-parent `<sec>` (without the attribute) should be `<Flush_Text>`. – Ian Campbell Aug 09 '12 at 23:32
  • @IanCampbell, Your comment contradicts the text in the question. Also, contributing to the confusion is the fact that you don't provide the wanted results. Also, it is not at all clear what you mean by "highest parent". Please, edit the question and make it meaningful. – Dimitre Novatchev Aug 10 '12 at 01:56
  • Ok thanks @Dimitre, sorry I confused myself with this one. So by my incorrect description of "highest parent" I meant last ancestor. I will edit this question to be more clear otherwise as well. ;) – Ian Campbell Aug 10 '12 at 02:32

2 Answers


Here's a solution that does what I think you want. It's hard to interpret your question.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes" method="xml" />
  <xsl:strip-space elements="*" />

  <!-- identity rule -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="title">
    <!-- a title directly inside a top-level <sec> has one <sec> ancestor, so it becomes H1 -->
    <xsl:element name="H{count(ancestor::sec)}">
      <xsl:apply-templates select="node() | @*" />
    </xsl:element>
  </xsl:template>

  <xsl:template match="p[ancestor::sec[last()][@sec-type]]">
    <Body_Text>
      <xsl:apply-templates select="node() | @*" />
    </Body_Text>
  </xsl:template>

  <xsl:template match="p">
    <Flush_Text>
      <xsl:apply-templates select="node() | @*" />
    </Flush_Text>
  </xsl:template>

</xsl:stylesheet>

http://www.xmlplayground.com/vmuroB

Tomalak

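Ok, so to produce the wanted results here, I have changed the match pattern of the Flush_Text template from <xsl:template match="sec//p"> to <xsl:template match="p[ancestor::sec[last()][not(@sec-type)]]">, and also removed the <xsl:if> test. Inside that template the context node is the <p>, so the old test was looking for sec-type on the <p> itself rather than on a <sec>.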

Here is the corrected XSL:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
<xsl:strip-space elements="*"/>

    <!-- identity rule -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

        <!-- Body_Text -->
        <xsl:template match="body//p">
            <Body_Text>
                <xsl:apply-templates select="@*|node()"/>
            </Body_Text>
        </xsl:template>

    <!-- Flush_Text -->
    <xsl:template match="p[ancestor::sec[last()][not(@sec-type)]]">
        <Flush_Text>
            <xsl:apply-templates select="@*|node()"/>
        </Flush_Text>
    </xsl:template>

        <!-- H1 -->
        <xsl:template match="sec//title">
            <H1>
                <xsl:apply-templates select="@*|node()"/>
            </H1>
        </xsl:template>

        <!-- H2 -->
        <xsl:template match="sec//sec//title">
            <H2>
                <xsl:apply-templates select="@*|node()"/>
            </H2>
        </xsl:template>

        <!-- H3 -->
        <xsl:template match="sec//sec//sec//title">
            <H3>
                <xsl:apply-templates select="@*|node()"/>
            </H3>
        </xsl:template>
</xsl:stylesheet>

...producing this desired output:

<root>
<body>
<sec sec-type="asdf">
<H1>This is an H1</H1>
<sec>
<H2>This is an H2</H2>
<sec>
<H3>This is an H3</H3>
<Body_Text>This SHOULD be "Body_Text", but it's "Flush_Text"</Body_Text>
</sec>

</sec>

</sec>

<sec>
<Flush_Text>This is Flush_Text</Flush_Text>
</sec>
<Body_Text>This is Body_Text</Body_Text>
</body>
</root>


This was tested at: http://xslt.online-toolz.com/tools/xslt-transformation.php.

Thanks @Tomalak for pointing me in the right direction with the ancestor XPath axis.

Here I match any <p> whose last ancestor <sec> (what I was incorrectly calling the "highest parent") does not have the attribute sec-type, and transform it to Flush_Text. This prevents the first <p> in this example, which has <sec sec-type... as its last ancestor, from becoming Flush_Text, so the Body_Text template applies to it instead.
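
The corrected stylesheet relies on the Flush_Text template winning over the body//p template when both match the same <p>. As an alternative, here is a minimal sketch that puts the same rule into a single template with xsl:choose. It assumes every <p> of interest sits somewhere under <body>, and it leaves out the heading templates for brevity. It is an illustration of the rule, not the stylesheet tested above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
  <xsl:strip-space elements="*"/>

  <!-- identity rule: copy everything that no other template handles -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- one template decides between Flush_Text and Body_Text -->
  <xsl:template match="body//p">
    <xsl:choose>
      <!-- ancestor::sec[last()] selects the outermost enclosing <sec>;
           the test is true only if that <sec> exists and has no sec-type -->
      <xsl:when test="ancestor::sec[last()][not(@sec-type)]">
        <Flush_Text>
          <xsl:apply-templates select="@*|node()"/>
        </Flush_Text>
      </xsl:when>
      <!-- everything else: outermost <sec> has sec-type, or there is no <sec> at all -->
      <xsl:otherwise>
        <Body_Text>
          <xsl:apply-templates select="@*|node()"/>
        </Body_Text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the XML in the question, this sketch should turn the <p> under <sec sec-type="asdf"> and the <p> directly under <body> into Body_Text, and the <p> inside the plain <sec> into Flush_Text. The <title> elements are simply copied, since no heading templates are included here.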

Also, I like Tomalak's use of automating H1 - H3... I am still experimenting with this, and don't want to use it until I fully understand it ;)
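
For reference, the idea behind automating the heading levels can be sketched as below: the element name is computed from how many <sec> elements enclose the title. This is only a sketch, and it assumes each <title> is a direct child of its <sec>, as in the example XML. A title inside a top-level <sec> has exactly one <sec> ancestor, so the count is used without any offset.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes" method="xml"/>
  <xsl:strip-space elements="*"/>

  <!-- identity rule: copy everything that no other template handles -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- one rule for every heading level -->
  <xsl:template match="sec/title">
    <!-- count(ancestor::sec) is the nesting depth of the enclosing <sec>:
         1 for a top-level section, 2 for a subsection, and so on.
         The {...} is an attribute value template, evaluated at run time. -->
    <xsl:element name="H{count(ancestor::sec)}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

With the example XML, the three titles sit at depths 1, 2 and 3, so they should come out as H1, H2 and H3, and deeper nesting would continue with H4 and beyond without any new templates.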

Ian Campbell