I need to tag words regardless of case. I found an answer that works well for case-sensitive matches, and I've modified it to better illustrate the case-insensitivity issue.

XML:

<?xml version="1.0" encoding="UTF-8"?>
<file>
    <text>
        <sentence>The safety of the bank’s safe is insured by Safeco.</sentence>
        <sentence>A safe place to shelter during a storm is the cellar.</sentence>
    </text>
</file>

XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">
    <!-- ============= -->
    <xsl:template match="sentence"> 
        <xsl:param name="search-term"/>
            <xsl:call-template name="hilite">
                <xsl:with-param name="text" select="."/>
                <xsl:with-param name="search-string" select="$search-term"/>
            </xsl:call-template><xsl:text>
</xsl:text>
    </xsl:template>
    <!-- ============= -->
    <xsl:template name="hilite">
        <xsl:param name="text"/>
        <xsl:param name="search-string"/>
        <xsl:choose>
            <xsl:when test="contains($text, $search-string)">
                <xsl:value-of select="substring-before($text, $search-string)"/>
                <mark>
                    <xsl:value-of select="$search-string"/>
                </mark>
                <xsl:call-template name="hilite">
                    <xsl:with-param name="text" select="substring-after($text, $search-string)"/>
                    <xsl:with-param name="search-string" select="$search-string"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="$text"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    <!-- ============= -->
    <xsl:template match="/"><xsl:text>
</xsl:text>
        <output><xsl:text>
</xsl:text>
            <xsl:apply-templates select="file/text/sentence">
                <xsl:with-param name="search-term">safe</xsl:with-param>
            </xsl:apply-templates>
        </output>
    </xsl:template>
    <!-- ============= -->
</xsl:stylesheet>

The output I get:

<?xml version="1.0" encoding="UTF-8"?>
<output>
    The <mark>safe</mark>ty of the bank’s <mark>safe</mark> is insured by Safeco.
    A <mark>safe</mark> place to shelter during a storm is the cellar.
</output>

But the occurrence of "safe" in "Safeco" is not matched because its case differs from the search term, so I don't get the output I want:

<?xml version="1.0" encoding="UTF-8"?>
<output>
    The <mark>safe</mark>ty of the bank’s <mark>safe</mark> is insured by <mark>Safe</mark>co.
    A <mark>safe</mark> place to shelter during a storm is the cellar.
</output>

How can I find all occurrences regardless of case and also retain the original case in the output?

dacracot
  • If you're able to use XSLT 2.0, then you don't need the recursive template. Use `xsl:analyze-string` instead. – michael.hor257k Apr 29 '19 at 17:46
  • @michael.hor257k I get the analyze-string and with the "i" flag I can achieve the case insensitivity I want, but how do I pass a param value into the analyze-string regex attribute? Rather than using the param content, it seems to use the variable name. – dacracot Apr 29 '19 at 18:21
  • See: https://www.w3.org/TR/xslt20/#attribute-value-templates – michael.hor257k Apr 29 '19 at 18:23
  • @michael.hor257k 1,000 up votes for you sir. – dacracot Apr 29 '19 at 18:24

1 Answer

This is much easier to do in XSLT 2.0 with its support for regular expressions:

XSLT 2.0

<xsl:stylesheet version="2.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/file">
    <output>
        <xsl:text>&#10;</xsl:text>
        <xsl:apply-templates select="text/sentence">
            <xsl:with-param name="search-term">safe</xsl:with-param>
        </xsl:apply-templates>
    </output>
</xsl:template>

<xsl:template match="sentence">
    <xsl:param name="search-term"/>
    <xsl:analyze-string select="." regex="{$search-term}" flags="i" >
        <xsl:matching-substring>
            <mark>
                <xsl:value-of select="." />
            </mark>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
            <xsl:value-of select="." />
        </xsl:non-matching-substring>
    </xsl:analyze-string>
    <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>

Result

<?xml version="1.0" encoding="UTF-8"?>
<output>
The <mark>safe</mark>ty of the bank’s <mark>safe</mark> is insured by <mark>Safe</mark>co.
A <mark>safe</mark> place to shelter during a storm is the cellar.
</output>
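
One caveat: `regex="{$search-term}"` treats the search term as a regular expression, not as a literal string. That makes no difference for a plain word like `safe`, but in a term such as `U.S.` the `.` would match any character. The following is a minimal sketch of one way to guard against that, assuming you want literal matching. The variable name `literal-term` is made up for this illustration; it backslash-escapes the regex metacharacters with `replace()` before the value reaches `xsl:analyze-string`.

```xml
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/file">
    <output>
        <xsl:text>&#10;</xsl:text>
        <xsl:apply-templates select="text/sentence">
            <xsl:with-param name="search-term">safe</xsl:with-param>
        </xsl:apply-templates>
    </output>
</xsl:template>

<xsl:template match="sentence">
    <xsl:param name="search-term"/>
    <!-- Hypothetical helper variable, not part of the stylesheet above:
         put a backslash in front of every regex metacharacter so the
         search term is matched literally. -->
    <xsl:variable name="literal-term"
        select="replace($search-term, '[\\\[\]\{\}\(\)\*\+\?\.\^\$\|\-]', '\\$0')"/>
    <!-- The "i" flag still makes the match case-insensitive. -->
    <xsl:analyze-string select="." regex="{$literal-term}" flags="i">
        <xsl:matching-substring>
            <mark>
                <!-- "." is the matched text, so the original case is kept. -->
                <xsl:value-of select="."/>
            </mark>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
            <xsl:value-of select="."/>
        </xsl:non-matching-substring>
    </xsl:analyze-string>
    <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

For a plain term like `safe` this should behave the same as the stylesheet above. Note that an empty search term would still be a problem, because `xsl:analyze-string` raises an error when the regex matches a zero-length string.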
michael.hor257k