This is a follow-up to the question linked below about capitalising the first character of each word:

Convert First character of each word to upper case

The answers there assume that words are separated by spaces. How can I dynamically treat any non-alphanumeric character as a word boundary and capitalise the letter that follows it?

For example, O'connel derrick should become O'Connel Derrick, and Adrian-merriel james should become Adrian-Merriel James.

I used the code below, and it works fine for strings whose words are separated by spaces:

<xsl:variable name='text' select='"dInEsh sAchdeV kApil Muk"' />
<xsl:variable name='lowers' select='"abcdefghijklmnopqrstuvwxyz"' />
<xsl:variable name='uppers' select='"ABCDEFGHIJKLMNOPQRSTUVWXYZ"' />

<xsl:template match="/">

    <xsl:for-each select='str:split($text, " ")'>
        <xsl:value-of select='concat(
            translate(substring(., 1, 1), $lowers, $uppers),
            translate(substring(., 2), $uppers, $lowers),
            " "
        )' />
    </xsl:for-each>
</xsl:template>

Any help is highly appreciated.

  • Since you use an extension like `str:split`, which XSLT 1.0 processor is that? Does it also support a `replace` function with regular expressions? – Martin Honnen Sep 06 '21 at 13:25

2 Answers

For an XSLT 1.0 solution, you could set up a recursive template that walks over the characters one at a time and tracks whether it has already seen an alphanumeric character in the current word. The first alphanumeric character is upper-cased, the following ones are lower-cased, and the flag is reset whenever a non-alphanumeric character is encountered:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml">
  
  <xsl:variable name='text' select='"dInEsh sAchdeV kApil Muk"' />
  <xsl:variable name='lowers' select='"abcdefghijklmnopqrstuvwxyz"' />
  <xsl:variable name='uppers' select='"ABCDEFGHIJKLMNOPQRSTUVWXYZ"' />
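  <!-- The digits must be a quoted string; unquoted, XPath reads a number and the leading 0 is lost -->
  <xsl:variable name='numeric' select='"0123456789"'/>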
  <xsl:variable name='alpha-numeric' select="concat($lowers,$uppers,$numeric)"/>
  <xsl:template match="/">
    
    <xsl:call-template name="capitalize">
      <xsl:with-param name="val" select="$text"/>
    </xsl:call-template>
    
  </xsl:template>
  
  <xsl:template name="capitalize">
    <xsl:param name="val"/>
    <xsl:param name="alphanumeric-seen" select="false()" />
    
    <xsl:variable name="head" select="substring($val, 1, 1)"/>
    
    <xsl:if test="$head">
      <xsl:variable name="is-alpha-numeric" select="not(translate($head, $alpha-numeric, ''))"/>
      <xsl:variable name="tail" select="substring($val, 2)"/>
      <xsl:choose>
        <xsl:when test="$is-alpha-numeric">
          <xsl:choose>
            <xsl:when test="$alphanumeric-seen">
              <xsl:value-of select="translate($head, $uppers, $lowers)"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="translate($head, $lowers, $uppers)"/>
            </xsl:otherwise>
          </xsl:choose>
          <xsl:call-template name="capitalize">
            <xsl:with-param name="val" select="$tail"/>
            <xsl:with-param name="alphanumeric-seen" select="true()"/>
          </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$head"/>
          <xsl:call-template name="capitalize">
            <xsl:with-param name="val" select="$tail"/>
            <xsl:with-param name="alphanumeric-seen" select="false()"/>
          </xsl:call-template>
        </xsl:otherwise>
      </xsl:choose> 
    </xsl:if>
  </xsl:template>
  
</xsl:stylesheet>

With XSLT 2.0 you could use xsl:analyze-string and regex:

<xsl:stylesheet exclude-result-prefixes="#all" version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml">

  <xsl:variable name='text' select='"dInEsh sAchdeV kApil Muk"' />
  <xsl:variable name='lowers' select='"abcdefghijklmnopqrstuvwxyz"' />
  <xsl:variable name='uppers' select='"ABCDEFGHIJKLMNOPQRSTUVWXYZ"' />
  
  <xsl:template match="/">
   
    <xsl:analyze-string select="$text" regex="[a-zA-Z0-9]+">
      <xsl:matching-substring>
        <xsl:value-of select="upper-case(substring(., 1, 1)), lower-case(substring(., 2))" separator=""/>            
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>  
   
  </xsl:template>

</xsl:stylesheet>
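
The `[a-zA-Z0-9]+` pattern above only recognises ASCII letters and digits. If names with accented letters need the same treatment, one possible variation is to use the Unicode categories `\p{L}` (letters) and `\p{N}` (numbers) instead. The following is only a sketch under that assumption, and the sample value of `$text` is made up for illustration. Note that `regex` is an attribute value template, so the curly braces have to be doubled:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical sample input, not taken from the question -->
  <xsl:variable name="text" select="'zoË-rené DUPONT'"/>

  <xsl:template match="/">
    <!-- regex is an attribute value template: \p{L} and \p{N} are written with doubled braces -->
    <xsl:analyze-string select="$text" regex="[\p{{L}}\p{{N}}]+">
      <xsl:matching-substring>
        <!-- Upper-case the first character of each run, lower-case the rest -->
        <xsl:value-of select="concat(upper-case(substring(., 1, 1)), lower-case(substring(., 2)))"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Delimiters are copied through unchanged -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With the sample value above, and assuming the accented letters are stored as single precomposed characters, this should produce Zoë-René Dupont.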
Mads Hansen
  • I tried the first stylesheet with XSLT 1.0, but it failed with the error: invalid attribute 'as' –  Sep 09 '21 at 23:36
  • Sorry about that, I had left it as a 2.0 stylesheet and had forgotten to change to 1.0. It should work if you remove the type `as="xs:boolean"`. I'll update that answer. – Mads Hansen Sep 09 '21 at 23:38
  • Thanks for the quick response. I see that if the text is "firstname lASTname", the outcome is "Firstname LASTname", but I expect it to be "Firstname Lastname". Could you please have a look? –  Sep 10 '21 at 00:09
  • Easy enough, you would just translate to lower-case values for the other characters that are alpha-numeric. I've updated the answer. – Mads Hansen Sep 10 '21 at 00:15
  • I simply replaced $tail with translate($tail, $uppers, $lowers) and it worked. –  Sep 10 '21 at 00:19

I would also use a recursive named template, but instead of going over each character in the text I would iterate over the delimiters only:

XSLT 1.0 + EXSLT str:split() extension function

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:str="http://exslt.org/strings"
extension-element-prefixes="str">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:param name="text">dInEsh sAchdeV kApil. muk O'connel derrick, Adrian-merriel james</xsl:param>

<xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'" />
<xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'" />

<xsl:variable name="punc" select="translate($text, concat($lower, $upper), '')" />

<xsl:template match="/">
    <output>
        <xsl:call-template name="capitalize">
            <xsl:with-param name="text" select="translate($text, $upper, $lower)"/>
            <xsl:with-param name="delimiters" select="translate($text, concat($lower, $upper), '')"/>
        </xsl:call-template>
    </output>
</xsl:template>
    
<xsl:template name="capitalize">
    <xsl:param name="text"/>
    <xsl:param name="delimiters"/>
    <xsl:choose>
        <xsl:when test="$delimiters">
            <xsl:variable name="delimiter" select="substring($delimiters, 1, 1)"/>
            <xsl:call-template name="capitalize">
                <xsl:with-param name="text">
                    <xsl:for-each select="str:split($text, $delimiter)">
                        <xsl:value-of select="translate(substring(., 1, 1), $lower, $upper)"/>
                        <xsl:value-of select="substring(., 2)"/>
                        <xsl:if test="position()!=last()">
                            <xsl:value-of select="$delimiter"/>
                        </xsl:if>
                    </xsl:for-each>
                </xsl:with-param>
                <xsl:with-param name="delimiters" select="translate($delimiters, $delimiter, '')"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="translate(substring($text, 1, 1), $lower, $upper)"/>
            <xsl:value-of select="substring($text, 2)"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

</xsl:stylesheet>

Result

<?xml version="1.0" encoding="UTF-8"?>
<output>Dinesh Sachdev Kapil. Muk O'Connel Derrick, Adrian-Merriel James</output>

However, this is not perfect: a run of consecutive delimiter characters of the same kind will be reduced to a single character, e.g. alpha---bravo ==> Alpha-Bravo.
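
One way to avoid that collapsing, sketched here as a possible variation rather than a tested replacement, is to drop `str:split()` and peel off one word at a time with `substring-before()` and `substring-after()`. The template name `capitalize-words` and the sample text are made up for this illustration. As in the stylesheet above, every character that is not an ASCII letter counts as a delimiter:

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<!-- Hypothetical sample input with a run of identical delimiters -->
<xsl:param name="text">alpha---bravo o'connel</xsl:param>

<xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'" />
<xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'" />

<xsl:template match="/">
    <xsl:call-template name="capitalize-words">
        <xsl:with-param name="text" select="translate($text, $upper, $lower)"/>
    </xsl:call-template>
</xsl:template>

<xsl:template name="capitalize-words">
    <xsl:param name="text"/>
    <!-- First non-letter character remaining in the text, or an empty string if there is none -->
    <xsl:variable name="delimiter" select="substring(translate($text, concat($lower, $upper), ''), 1, 1)"/>
    <xsl:choose>
        <xsl:when test="$delimiter">
            <!-- The word before the delimiter; empty when two delimiters are adjacent -->
            <xsl:variable name="word" select="substring-before($text, $delimiter)"/>
            <xsl:value-of select="translate(substring($word, 1, 1), $lower, $upper)"/>
            <xsl:value-of select="substring($word, 2)"/>
            <xsl:value-of select="$delimiter"/>
            <!-- Continue with everything after this single delimiter -->
            <xsl:call-template name="capitalize-words">
                <xsl:with-param name="text" select="substring-after($text, $delimiter)"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <!-- No delimiter left: capitalize the final word -->
            <xsl:value-of select="translate(substring($text, 1, 1), $lower, $upper)"/>
            <xsl:value-of select="substring($text, 2)"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

</xsl:stylesheet>
```

Because each call consumes exactly one delimiter character, the sample text should come out as Alpha---Bravo O'Connel. The recursion depth grows with the number of delimiter characters, which may matter for very long inputs.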

michael.hor257k