My XML input looks like:

<?xml version="1.0" ?>
<input>
  <record>
    <name>James Smith</name>
    <country>United Kingdom</country>
    <opt>
      good social skills,
      <qualification>MSc</qualification>,
      10 years of experience
    </opt>
    <section>1B</section>
  </record>
  <record>
    <name>Rafael Pérez</name>
    <country>Spain</country>
    <section>2A</section>
  </record>
  <record>
    <name>Marie-Claire Legrand</name>
    <country>France</country>
    <opt>
      clear voice,
      <qualification>MBA</qualification>,
      3 years of experience
    </opt>
    <section>1B</section>
  </record>

</input>

I want to output the contents of the <opt> element between parentheses, removing the leading and trailing spaces and newlines around the contents of its children. This would be very easy if <opt> had only a single text child, since I could apply the function normalize-space() to it, but I cannot see how to apply that function to a whole set of nodes.

An MWE of my code looks as follows:

<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text" indent="yes" encoding="utf-8"/>

  <xsl:template match="input">
    <xsl:text>------------------------------------------&#xa;</xsl:text>
    <xsl:for-each select="record">
      <xsl:apply-templates 
        select="node()[not(self::text()[not(normalize-space())])]"/>
      <xsl:text>&#xa;------------------------------------------&#xa;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <xsl:template match="qualification">
    <xsl:choose>
      <xsl:when test=". = 'MBA'">Master in Business Administration</xsl:when>
      <xsl:when test=". = 'MSc'">Master in Sciences</xsl:when>
      <xsl:otherwise><xsl:value-of select="."/></xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="name|country">
    <xsl:value-of select="."/>
    <xsl:text>, </xsl:text>
  </xsl:template>

  <xsl:template match="section">
    <xsl:text>Section: </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>.</xsl:text>
  </xsl:template>

  <xsl:template match="opt">
    <xsl:text>(</xsl:text>
    <xsl:apply-templates/>
    <xsl:text>), </xsl:text>
  </xsl:template>

</xsl:stylesheet>

but it gives me the wrong output, with unwanted whitespace inside the parentheses, as below:

------------------------------------------
James Smith, United Kingdom, (
      good social skills,
      Master in Sciences,
      10 years of experience
    ), Section: 1B.
------------------------------------------
Rafael Pérez, Spain, Section: 2A.
------------------------------------------
Marie-Claire Legrand, France, (
      clear voice,
      Master in Business Administration,
      3 years of experience
    ), Section: 1B.
------------------------------------------

The output I want is:

------------------------------------------
James Smith, United Kingdom, (good social skills,
      Master in Sciences,
      10 years of experience), Section: 1B.
------------------------------------------
Rafael Pérez, Spain, Section: 2A.
------------------------------------------
Marie-Claire Legrand, France, (clear voice, 
      Master in Business Administration, 
      3 years of experience), Section: 1B.
------------------------------------------

I understand I have to modify the template "opt", but I cannot find how.

Pierre François

2 Answers

Try perhaps something like:

XSLT 1.0

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="utf-8"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/input">
    <xsl:text>------------------------------------------&#xa;</xsl:text>
    <xsl:apply-templates/>
</xsl:template>

<xsl:template match="record">
    <xsl:apply-templates/>
    <xsl:text>&#xa;------------------------------------------&#xa;</xsl:text>
</xsl:template>

<xsl:template match="qualification">
    <xsl:text> </xsl:text>
    <xsl:choose>
        <xsl:when test=". = 'MBA'">Master in Business Administration</xsl:when>
        <xsl:when test=". = 'MSc'">Master in Sciences</xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="."/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

<xsl:template match="name|country">
    <xsl:value-of select="."/>
    <xsl:text>, </xsl:text>
</xsl:template>

<xsl:template match="section">
    <xsl:text>Section: </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>.</xsl:text>
</xsl:template>

<xsl:template match="opt">
    <xsl:text>(</xsl:text>
    <xsl:apply-templates/>
    <xsl:text>), </xsl:text>
</xsl:template>

<xsl:template match="opt/text()">
    <xsl:value-of select="normalize-space(.)"/>
</xsl:template>

</xsl:stylesheet>

The result is different from the one you show, but you say it doesn't matter, and the whitespace inside the parentheses is removed:

------------------------------------------
James Smith, United Kingdom, (good social skills, Master in Sciences, 10 years of experience), Section: 1B.
------------------------------------------
Rafael Pérez, Spain, Section: 2A.
------------------------------------------
Marie-Claire Legrand, France, (clear voice, Master in Business Administration, 3 years of experience), Section: 1B.
------------------------------------------
michael.hor257k

I accepted michael.hor257k's solution yesterday, but I didn't realize that applying the function normalize-space() to each child node separately also removes the blanks between the nodes, which is not exactly what I want.
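To illustrate the effect outside XSLT, here is a minimal Python sketch. It assumes a hand-written `normalize_space` helper as a rough stand-in for the XPath function, and the three strings are the child nodes of the first <opt> as I read them from the input, with the qualification already expanded. Nothing here is part of the stylesheet.

```python
import re

def normalize_space(s: str) -> str:
    """Rough stand-in for XPath 1.0 normalize-space() applied to a string:
    trim leading/trailing whitespace, collapse inner runs to one space."""
    return re.sub(r"[ \t\r\n]+", " ", s.strip(" \t\r\n"))

# String values of the three child nodes of the first <opt>, in document order.
nodes = [
    "\n      good social skills,\n      ",
    "Master in Sciences",
    ",\n      10 years of experience\n    ",
]

# Normalizing each node on its own also trims the whitespace that separated
# the first text node from the element that follows it.
per_node = "".join(normalize_space(n) for n in nodes)
print(per_node)
# good social skills,Master in Sciences, 10 years of experience
```

The space after "skills," is lost because it lived at the end of the first text node, which is why a per-node approach has to re-insert its own separator.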

So I found a way to apply the function normalize-space() to a whole subtree: I capture the output of applying the templates to all the nodes of the subtree in a variable. Once the variable is defined, I can apply normalize-space() to its string value, which outputs the result I want:

<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text" indent="yes" encoding="utf-8"/>

  <xsl:template match="input">
    <xsl:text>------------------------------------------&#xa;</xsl:text>
    <xsl:for-each select="record">
      <xsl:apply-templates 
        select="node()[not(self::text()[not(normalize-space())])]"/>
      <xsl:text>&#xa;------------------------------------------&#xa;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <xsl:template match="qualification">
    <xsl:choose>
      <xsl:when test=". = 'MBA'">Master in Business Administration</xsl:when>
      <xsl:when test=". = 'MSc'">Master in Sciences</xsl:when>
      <xsl:otherwise><xsl:value-of select="."/></xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="name|country">
    <xsl:value-of select="."/>
    <xsl:text>, </xsl:text>
  </xsl:template>

  <xsl:template match="section">
    <xsl:text>Section: </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>.</xsl:text>
  </xsl:template>

  <xsl:template match="opt">
    <xsl:variable name="text-in-parenthesis">
      <xsl:apply-templates/>
    </xsl:variable>
    <xsl:text>(</xsl:text>
    <xsl:value-of select="normalize-space($text-in-parenthesis)"/>
    <xsl:text>), </xsl:text>
  </xsl:template>
  
</xsl:stylesheet>

With the input:

<?xml version="1.0" ?>
<input>
  <record>
    <name>James Smith</name>
    <country>United Kingdom</country>
    <opt>
      good social skills,
      <qualification>MSc</qualification>,
      10 years of experience
    </opt>
    <section>1B</section>
  </record>
  <record>
    <name>Rafael Pérez</name>
    <country>Spain</country>
    <section>2A</section>
  </record>
  <record>
    <name>Marie-Claire Legrand</name>
    <country>France</country>
    <opt>
      clear voice,
      <qualification>MBA</qualification>,
      3 years of experience
    </opt>
    <section>1B</section>
  </record>

</input>

I get:

------------------------------------------
James Smith, United Kingdom, (good social skills, Master in Sciences, 10 years of experience), Section: 1B.
------------------------------------------
Rafael Pérez, Spain, Section: 2A.
------------------------------------------
Marie-Claire Legrand, France, (clear voice, Master in Business Administration, 3 years of experience), Section: 1B.
------------------------------------------

This gives me the output I was looking for. The indentation and line breaks are gone, but the items inside the parentheses are still separated by single spaces.
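The same idea, building the whole string first and normalizing it once, can be sketched in Python with the standard library. This is only an illustration under assumptions: `normalize_space`, `opt_text` and the `QUALIFICATIONS` table are hypothetical names that mimic the stylesheet, not a replacement for it, and the sample is a shortened copy of the input.

```python
import re
import xml.etree.ElementTree as ET

# Mirrors the xsl:choose in the "qualification" template.
QUALIFICATIONS = {
    "MBA": "Master in Business Administration",
    "MSc": "Master in Sciences",
}

def normalize_space(s: str) -> str:
    """Rough stand-in for XPath 1.0 normalize-space() applied to a string."""
    return re.sub(r"[ \t\r\n]+", " ", s.strip(" \t\r\n"))

def opt_text(opt: ET.Element) -> str:
    """Concatenate the output for every child of <opt>, then normalize once."""
    parts = [opt.text or ""]
    for child in opt:
        text = child.text or ""
        parts.append(QUALIFICATIONS.get(text, text))
        parts.append(child.tail or "")  # text node following the element
    return normalize_space("".join(parts))

SAMPLE = """<input>
  <record>
    <opt>
      good social skills,
      <qualification>MSc</qualification>,
      10 years of experience
    </opt>
  </record>
  <record>
    <opt>
      clear voice,
      <qualification>MBA</qualification>,
      3 years of experience
    </opt>
  </record>
</input>"""

results = [opt_text(opt) for opt in ET.fromstring(SAMPLE).iter("opt")]
for line in results:
    print(f"({line})")
```

Because the whitespace between nodes is still present when the string is normalized, each run collapses to a single space instead of disappearing.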

Pierre François
    It is difficult to deduce the rules from a single example. If you want to have *some* whitespace separating the nodes, but without keeping *all* of the original whitespace, then you have 2 options: (a) remove all the original whitespace and insert your own or (b) keep the original whitespace and perform normalize-space on the final result. I have selected (a) but your choice of (b) can work just as well. – michael.hor257k Nov 16 '22 at 16:42