I need help creating a regex to replace some text.

Here's the sample input:

<variable class="loves">[loves] My dog loves dog food </variable>

Here's the sample output I am expecting:

<variable class="loves">[loves] My dog hates dog food </variable>

Thank you

The regex I am currently using matches the word loves both inside and outside the square brackets. I want to ignore the word loves when it is written inside square brackets.

  • I am using this regex - (?<=variable.*>.*)loves(?=.*<\/variable)
  • You can exclude the square brackets using negative lookarounds `(?<!\[)\bloves\b(?!])`. The word boundaries will prevent a partial word match if you also want that. Note that in matching `` and may match more than you would expect. – The fourth bird Jun 21 '23 at 12:49
  • Your output is identical to your input, is this intended? – kwoxer Jun 21 '23 at 15:23
  • Add another look-around, _(?<=variable.*>.*)(?<!\[)loves(?=.*<\/variable)_ – Reilas Jun 21 '23 at 21:21

1 Answer

A generic quick fix would look like this:

(?<=<variable.*>.*)loves(?!(?<=\[[^][]*)[^][]*])(?=.*</variable)

The negative lookahead (?!(?<=\[[^][]*)[^][]*]), placed right after loves, makes the pattern match only when loves does not sit between a pair of square brackets with no other square brackets in between.

See this regex demo

Note that the multiple .* parts will make the regex search slower, and that loves is matched even inside longer words (like gloves). To address the first problem, use negated character classes; to address the second, use word boundaries:

(?<=<variable[^>]*>[^>]*)\bloves\b(?!(?<=\[[^][]*)[^][]*])(?=[^<]*</variable)

See this regex demo.
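
For readers who want to try the same idea in Python, here is a minimal sketch. It is not the regex above: Python's built-in re module rejects unbounded lookbehinds such as (?<=<variable[^>]*>[^>]*), so this version swaps in a two-step approach (first capture the variable element, then replace inside its body) and checks only for a following ], not a preceding [. The function and constant names are made up for illustration.

```python
import re

# Step 1: capture the opening tag, the body (no nested tags), and the closing tag.
ELEMENT = re.compile(r"(<variable\b[^>]*>)([^<]*)(</variable>)")

# Step 2: "loves" as a whole word, unless a "]" follows with no other
# square bracket in between (i.e. the word sits inside [...]).
WORD = re.compile(r"\bloves\b(?![^\]\[]*\])")


def replace_in_variable(text: str, replacement: str = "hates") -> str:
    """Replace 'loves' inside <variable> bodies, skipping bracketed text."""

    def fix_body(m: re.Match) -> str:
        # Only the body (group 2) is rewritten; the tags are kept verbatim.
        return m.group(1) + WORD.sub(replacement, m.group(2)) + m.group(3)

    return ELEMENT.sub(fix_body, text)


sample = '<variable class="loves">[loves] My dog loves dog food </variable>'
print(replace_in_variable(sample))
```

Because the opening tag is captured separately, the class="loves" attribute is left alone, and \b keeps words like gloves from matching.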

The [^>]* and [^<]* patterns won't work if you have other tags inside the variable tag, so you will have to rely either on .* or on a tempered greedy token like (?:(?!</?variable\b).)*.
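
The difference can be checked with a small Python sketch. It only compares how the two body patterns behave on a nested tag; the sample string and the names NEGATED, TEMPERED and variable_bodies are assumptions made for this illustration, and re.DOTALL is added so the dot can also cross line breaks.

```python
import re

# Body written as a negated character class: it stops at any "<".
NEGATED = re.compile(r"<variable\b[^>]*>([^<]*)</variable>")

# Body written as a tempered greedy token: consume one character at a time,
# as long as an opening or closing <variable> tag does not start there.
TEMPERED = re.compile(
    r"<variable\b[^>]*>((?:(?!</?variable\b).)*)</variable>", re.DOTALL
)


def variable_bodies(pattern: re.Pattern, text: str) -> list:
    """Return the captured body of every <variable> element the pattern finds."""
    return pattern.findall(text)


nested = '<variable class="loves">My <b>dog</b> loves dog food</variable>'
print(variable_bodies(NEGATED, nested))   # no match: [^<]* cannot cross <b>
print(variable_bodies(TEMPERED, nested))  # the full body, nested tags included
```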

Wiktor Stribiżew