I have a string in the following format:

abc\(def\(ghj)klm)nop\(qrs)tuvw\(xyz\(abc\(def)ghi)jkl)mno)pqr)\(stu)

My current regex is:

(\\\((?:\[??[^\"\n]*?\)))

It gives me the output shown in the image. How can I update this regex to get the expected result?

Hassan Taleb
  • Regexes aren't suited to parsing tree-hierarchy structures. Although this *may* be possible, you'll end up with a complicated regex that is hard to maintain and easy to break – Cid Mar 11 '21 at 07:29
  • Sounds way easier to match the number of `\(` with the number of `)` using a *good old loop* – Cid Mar 11 '21 at 07:30
  • I second the loop approach. What language do you use? If regex, how many levels deep? If, say max 3 levels, you could build a longer regex that first matches for 3 levels, then 2 levels, then 1 level. – Peter Thoeny Mar 11 '21 at 08:05

2 Answers

1

Recursion can help you with that:

\\\(([^()]|(?R))*\)

Example here

1

Some regex dialects support recursion, namely Perl 5.10, PCRE 4.0, and Ruby 2.0. As @Vladimir Trifonov pointed out in his answer, you can match your string with a regex engine that supports recursion.

If your regex engine does not support recursion, you can build a longer regex that handles nesting up to a fixed maximum level. Here are the patterns and their matches, using your input string as an example:

  • max 1 level: /(\\\([^\)]*\))/g
    • match: ["\(def\(ghj)", "\(qrs)", "\(xyz\(abc\(def)", "\(stu)"]
  • max 2 levels: /(\\\((?:[^\\\)]*\\\([^\)]*\))?[^\)]*\))/g
    • match: ["\(def\(ghj)klm)", "\(qrs)", "\(xyz\(abc\(def)ghi)", "\(stu)"]
  • max 3 levels: /(\\\((?:[^\\\)]*\\\((?:[^\\\)]*\\\([^\)]*\))?[^\)]*\))?[^\)]*\))/g
    • match: ["\(def\(ghj)klm)", "\(qrs)", "\(xyz\(abc\(def)ghi)jkl)", "\(stu)"]
  • etc.

In other words, you can increase the max level by nesting (?:[^\\\)]*\\\( ... [^\)]*\))? patterns; they are made optional with ?.
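
The nesting rule above can also be applied programmatically instead of writing each level by hand. Here is a minimal Python sketch using only the standard `re` module; the function name `nested_pattern` and the variable `sample` are made up for this illustration and are not part of any library:

```python
import re


def nested_pattern(max_depth: int) -> str:
    """Build the non-recursive pattern for up to max_depth nesting levels.

    Hypothetical helper: it wraps the previous level in the optional
    group (?:[^\\\)]* ... )? once per extra level.
    """
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    pattern = r"\\\([^\)]*\)"  # level 1: a backslash, "(", anything up to the first ")"
    for _ in range(max_depth - 1):
        # Each pass allows one more optional nested \( ... ) inside the group.
        pattern = r"\\\((?:[^\\\)]*" + pattern + r")?[^\)]*\)"
    return pattern


# The input string from the question, as a raw string so the backslashes survive.
sample = r"abc\(def\(ghj)klm)nop\(qrs)tuvw\(xyz\(abc\(def)ghi)jkl)mno)pqr)\(stu)"

for depth in (1, 2, 3):
    print(depth, re.findall(nested_pattern(depth), sample))
```

For depths 1 to 3 this generates the same three patterns listed above (without the outer capture group and the `/.../g` delimiters), so `re.findall` should return the same match lists.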

EDIT: In case you need to support unlimited nesting and your regex engine does not support recursion, you can use a series of regexes. See JavaScript example at https://stackoverflow.com/a/66558399/7475450 and Perl example at https://twiki.org/cgi-bin/view/Blog/BlogEntry201109x3
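
If a regex is not a hard requirement, the plain loop suggested in the comments also handles unlimited nesting. Note that this is a different technique from the series-of-regexes approach in the linked posts. A minimal Python sketch, assuming every group opens with `\(` and closes with a bare `)`; the function name `top_level_groups` is made up for this illustration:

```python
def top_level_groups(text: str) -> list[str]:
    """Return each outermost \\( ... ) group, tracking depth with a counter."""
    groups = []
    depth = 0   # number of currently open \( markers
    start = 0   # index where the current outermost group began
    i = 0
    while i < len(text):
        if text.startswith("\\(", i):
            if depth == 0:
                start = i          # a new outermost group starts here
            depth += 1
            i += 2                 # skip both characters of the opener
            continue
        if text[i] == ")" and depth > 0:
            depth -= 1
            if depth == 0:         # the outermost group just closed
                groups.append(text[start:i + 1])
        # A ")" seen at depth 0 has no matching opener and is ignored (assumption).
        i += 1
    return groups


sample = r"abc\(def\(ghj)klm)nop\(qrs)tuvw\(xyz\(abc\(def)ghi)jkl)mno)pqr)\(stu)"
print(top_level_groups(sample))
```

On the input string from the question, this should return the same four groups as the max-3-levels regex, and it keeps working at any nesting depth.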

Peter Thoeny