I have the following XML:

<t>a_35345_0_234_345_666_888</t>

I would like to replace the first number that follows an underscore with the fixed number 234, so the result should look like:

<t>a_234_0_234_345_666_888</t>

I have tried using the following but it does not work:

<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <xsl:template match="/">
     <xsl:value-of select='replace(., "(.*)_\d+_(.*)", "$1_234_$2")'/>
   </xsl:template>
</xsl:stylesheet>

UPDATE

The following works for me (thanks @Chris85): remove the trailing underscore from the pattern and add ? after the first .* to make it non-greedy.

<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <xsl:template match="/">
     <xsl:value-of select='replace(., "(.*?)_\d+(.*)", "$1_234$2")'/>
   </xsl:template>
</xsl:stylesheet>
M Tracker
  • What happens currently? I think you need to make it non-greedy: `.*?`. – chris85 Jun 22 '15 at 23:23
  • Hi @Chris85 - thanks, that worked! Is it possible to change the expression so that a word boundary is used at the end instead of "_"? The usual word boundary (\b) is not supported in XSLT. I am using XSLT 2.0. Thank you! – M Tracker Jun 23 '15 at 14:32
  • I'm not sure; I don't work with XSLT often (once a year or less), though I work with regexes much more frequently. Could you describe the problem you're encountering? Maybe there's another approach to it. – chris85 Jun 23 '15 at 14:52
  • The following works for me. Just remove the underscore. replace(., "(.*?)_\d+(.*)", "$1_234$2") – M Tracker Jun 23 '15 at 14:55
  • Please update your question with the regex you are currently using and what you'd like it to accomplish. – chris85 Jun 23 '15 at 14:56
  • Is there something else you are trying to accomplish though or is this resolved? It sounded like using the ending `_` wasn't working for you. – chris85 Jun 23 '15 at 15:20

1 Answer

Your regex was greedy: the .* consumes as much as it can, so it runs up to the last position where the rest of the pattern can still match.

So

(.*)_\d+_(.*)

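was putting

a_35345_0_234_345

into $1, matching _666_ as the number to replace, and putting 888 into $2. The replacement therefore changed the last number that is followed by an underscore rather than the first, giving a_35345_0_234_345_234_888.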
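One way to check what each group captures is to run the same pattern through xsl:analyze-string and print the groups. The following is only a diagnostic sketch, not part of the original stylesheet; it assumes an XSLT 2.0 processor and the input document from the question, and the 'group 1' / 'group 2' labels are purely illustrative.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>
   <xsl:template match="/">
     <!-- Same greedy pattern as in the question; the regex attribute is an
          attribute value template, so literal curly braces would need doubling -->
     <xsl:analyze-string select="string(.)" regex="(.*)_\d+_(.*)">
       <xsl:matching-substring>
         <!-- regex-group(N) returns the text captured by the Nth parenthesized group -->
         <xsl:value-of select="concat('group 1: ', regex-group(1), '&#10;')"/>
         <xsl:value-of select="concat('group 2: ', regex-group(2), '&#10;')"/>
       </xsl:matching-substring>
     </xsl:analyze-string>
   </xsl:template>
</xsl:stylesheet>
```

Changing the regex attribute to (.*?)_\d+(.*) and re-running should show how the captures shift once the first group is non-greedy.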

To make it non-greedy, add a ? after the .*. The resulting .*? matches as few characters as possible, so it stops at the first position where the rest of the pattern can match.

Functional example:

<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <xsl:template match="/">
     <xsl:value-of select='replace(., "(.*?)_\d+(.*)", "$1_234$2")'/>
   </xsl:template>
</xsl:stylesheet>
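A possible variant, sketched here as an assumption rather than something from the original answer, avoids the reluctant quantifier altogether: anchor the pattern at the start of the string and use a negated character class for the prefix. It assumes the value always has the shape shown in the question, that is, a prefix without underscores followed directly by an underscore and digits.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:template match="/">
     <!-- ^      : start of the string (no "m" flag, so not per line)
          [^_]*  : prefix up to, but not including, the first underscore
          _\d+   : the first underscore and the digits right after it
          Everything after the match is left untouched by replace(). -->
     <xsl:value-of select='replace(., "^([^_]*)_\d+", "$1_234")'/>
   </xsl:template>
</xsl:stylesheet>
```

Because the pattern is anchored, it can match at most once. The trade-off is that it does nothing when the segment after the first underscore is not numeric, whereas the non-greedy version would move on to the first later underscore that is followed by digits.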

Here's some more documentation on repetition and greediness: http://www.regular-expressions.info/repeat.html

chris85