I want to remove every character other than letters and digits between the two symbols < and >, replacing them with an empty string. The string is <F=*A*B*C*>

 (?<=F=|\G(?!^))[A-Za-z1-9]*\K[^A-Za-z1-9]+

 //output:<F=ABC 

 (?:^<F=(?=.+>$)|\G(?!^))[A-Za-z1-9]*\K[^A-Za-z1-9]+
 
 //output:<F=ABC 

Both patterns also match the closing > and remove it (giving <F=ABC). How can I make matching stop at a specific symbol so that the closing > is not matched?

When I add > to the negated class [^A-Za-z1-9], it correctly removes the unwanted characters and leaves the > symbol in place.

(?<=F=|\G(?!^))[A-Za-z1-9]*\K[^A-Za-z1-9>]+

//output: <F=ABC> (desired result)

What is the correct way to make matching stop at this symbol? Thank you.

Premlatha

1 Answer

You can use

(?:\G(?!^)|<F=)[^<>]*?\K[^A-Za-z0-9<>]+(?=[^<>]*>)

See the regex demo.

Details:

  • (?:\G(?!^)|<F=) - either the end of the previous match or <F= text
  • [^<>]*? - any zero or more chars other than < and >, as few as possible
  • \K - match reset operator that discards the text matched so far from the overall match memory buffer
  • [^A-Za-z0-9<>]+ - one or more chars other than ASCII letters/digits and < and > chars
  • (?=[^<>]*>) - immediately on the right, there must be zero or more chars other than < and > and then a > char.
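
The pattern above relies on \G and \K, which need a PCRE-style engine; Python's built-in re module, for instance, supports neither. As a rough sketch of a different technique that gives the same result for the sample string, you could capture the text between <F= and > and clean it in a replacement callback. The helper name clean_tag_bodies is hypothetical and only used for illustration:

```python
import re

# Matches a whole <F=...> tag and captures its body (no < or > inside).
TAG = re.compile(r"<F=([^<>]*)>")
# Matches runs of characters that are not ASCII letters or digits.
NON_ALNUM = re.compile(r"[^A-Za-z0-9]+")


def clean_tag_bodies(text):
    """Remove non-alphanumeric ASCII chars inside each closed <F=...> tag.

    Hypothetical helper: a callback-based alternative, not the \\G/\\K regex.
    """
    def _clean(match):
        # Rebuild the tag around the cleaned body so < and > are never touched.
        return "<F=" + NON_ALNUM.sub("", match.group(1)) + ">"

    return TAG.sub(_clean, text)


print(clean_tag_bodies("<F=*A*B*C*>"))  # <F=ABC>
```

Because the closing > is part of the outer pattern, a tag that is never closed is left unchanged, which plays the same role as the (?=[^<>]*>) lookahead.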
Wiktor Stribiżew