
I have a list of XML elements with values. I'd like to remove any characters or numbers after a specific character (in this case, a period), but only within the someTag element.

<someTag>123.3</someTag>
<someTag>8623.34</someTag>

I'm able to target periods inside the tag using: \.(?=[^<]*</someTag>). However, I can't figure out how to remove the period and everything after it so that the end result would be:

<someTag>123</someTag>
<someTag>8623</someTag>

Any help is greatly appreciated!

g-nice

1 Answer

Your pattern matches only the period. You also need to match everything after it, up to the closing tag. Try this:

(\.[^\<]+)<\/someTag>

[^\<]+ is a negated character set: it matches one or more characters that are not in the set. Here the set contains only <, so the group starts at the period and runs up to (but not including) the next < character.
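As a rough illustration of how the pattern above could be applied, here is a minimal sketch in Python. The choice of Python's re module, the sample string, and the variable names are assumptions made for the example, since the question does not say which tool or language is in use. Because the match includes the closing tag, the sketch writes the tag back as the replacement text.

```python
import re

# Sample input (assumed for illustration): two someTag elements and one other tag.
xml = "<someTag>123.3</someTag>\n<someTag>8623.34</someTag>\n<other>1.5</other>"

# The answer's pattern: group 1 is the period plus everything up to the next "<".
# The closing tag is part of the overall match but not part of group 1.
pattern = re.compile(r"(\.[^<]+)</someTag>")

# The whole match is replaced, so put the closing tag back in the replacement.
result = pattern.sub("</someTag>", xml)
print(result)
```

Only the values inside someTag are truncated here; the period inside the other element is left alone because the pattern requires the match to end at the someTag closing tag.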

Regex101

Code Different
  • Nice! This selects everything after the period, including the closing </someTag>. Out of curiosity, is there a way to keep the closing tag intact, so it doesn't get removed as well? (Note: you could just replace everything with the closing </someTag>, which would work the same way.) Thanks! – g-nice Oct 16 '15 at 13:07
  • You don't have to replace the whole match; extract the first capture group and replace it with an empty string. – Code Different Oct 16 '15 at 13:13
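One way to keep the closing tag out of the match entirely is a lookahead, which combines the lookahead idea from the question with the negated character set from the answer. This is a different technique from extracting the capture group, shown here only as a sketch; Python and the helper name strip_fraction are assumptions, not something the thread specifies.

```python
import re

def strip_fraction(xml: str) -> str:
    """Hypothetical helper: drop the period and what follows it inside someTag."""
    # \.[^<]+ matches the period and the characters after it, up to the next "<".
    # (?=</someTag>) only checks that the closing tag comes next, without
    # consuming it, so replacing the match with "" leaves the tag in place.
    return re.sub(r"\.[^<]+(?=</someTag>)", "", xml)

sample = "<someTag>123.3</someTag><price>9.99</price><someTag>8623.34</someTag>"
cleaned = strip_fraction(sample)
print(cleaned)
```

With this form the replacement string can be empty, since nothing outside the unwanted text is matched. It does rely on the regex engine supporting lookahead, which may not hold for every tool.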