2

I am using Java for my program.
Suppose I have a string such as this:

xx(yyzz(iijj))qq((kkll)(gghh))

Is there any way I could match xx(yyzz(iijj)) and qq((kkll)(gghh)) separately using a regex?

Sufian Latif
Cedric Mamo
  • I've seen it referred to as BF in mixed company... but good luck with that. BF looks like a chore. – The Real Baumann Feb 02 '12 at 18:55
  • It's not so bad. I got the expression generator working quite well (if you only use constants :D), as well as expressions, a random number generator, and a tool to convert strings into compact bf code. This problem has arisen when trying to add functionality to be able to call functions from within an expression or when accessing a value in an array from within an expression – Cedric Mamo Feb 02 '12 at 18:59
  • Why do you want to use a regular expression to find the matching parentheses instead of using another method? – Anderson Green Jun 28 '13 at 21:33
  • Just to see if it could be done. Which it totally can in some flavors of regex, such as .NET's. – Cedric Mamo Jun 29 '13 at 07:53

2 Answers

7

The simple answer is no, there's not a way to do it using just a regular expression. Just iterate through the string and push each open parenthesis onto a stack. Pop when you hit a closing parenthesis. If you try to pop from an empty stack, or you finish and the stack isn't empty, then the string is invalid.

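You could also do this recursively by removing the first '(' (indexOf) and the last ')' (lastIndexOf), verifying each time that the index of '(' is less than the index of ')'. Note that this only handles strictly nested parentheses: a string with sibling groups such as (a)(b) would be rejected.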
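
The stack-based check described above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the class name ParenBalance and the method isBalanced are made up for illustration, and it assumes that only '(' and ')' matter and every other character is ignored.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, named for illustration only.
final class ParenBalance {

    /** Returns true if every '(' has a matching ')' that comes after it. */
    static boolean isBalanced(String s) {
        Deque<Integer> open = new ArrayDeque<>(); // indices of '(' not yet closed
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                open.push(i);
            } else if (c == ')') {
                if (open.isEmpty()) {
                    return false; // tried to pop with nothing on the stack
                }
                open.pop();
            }
        }
        return open.isEmpty(); // anything left over is an unclosed '('
    }
}
```

Storing the index of each '(' is not needed for a plain yes/no answer, but it makes it easy to report where an unclosed parenthesis sits.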

The Real Baumann
  • ok thanks. I like a straight answer. May need to rethink my current code then. Thanks for the quick reply :) – Cedric Mamo Feb 02 '12 at 18:55
  • And it's done. I implemented something like what you described, except I kept a counter instead of a stack: increment the counter for each ( found and decrement it for each ). If the counter reaches 0, it means I've hit a match as I described it in the question. Thanks :) I would vote your answer up, but I don't have enough rep yet :S – Cedric Mamo Feb 02 '12 at 23:51
  • @CedricMamo you should be able to mark an answer as accepted by clicking on the check mark below the upvote/downvote. Thanks. – The Real Baumann Feb 03 '12 at 00:04
  • @CedricMamo Can you post the code you implemented as an answer, then? – Anderson Green Jun 28 '13 at 21:34
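
The code from that comment was never posted, so the following is only a guess at what a counter-based version might look like, not the asker's actual implementation. The names TopLevelGroups and split are hypothetical, and the sketch assumes that text after the last closing parenthesis can be dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, named for illustration only.
final class TopLevelGroups {

    /** Splits s after every ')' that brings the nesting depth back to zero. */
    static List<String> split(String s) {
        List<String> groups = new ArrayList<>();
        int depth = 0; // number of '(' currently open
        int start = 0; // index where the current group begins
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    throw new IllegalArgumentException("unmatched ')' at index " + i);
                }
                if (depth == 0) {
                    // Back at the top level: everything since 'start' is one group.
                    groups.add(s.substring(start, i + 1));
                    start = i + 1;
                }
            }
        }
        if (depth != 0) {
            throw new IllegalArgumentException("unmatched '('");
        }
        return groups; // assumption: trailing text without parentheses is ignored
    }
}
```

Called with the string from the question, split returns the two pieces xx(yyzz(iijj)) and qq((kkll)(gghh)).
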
4

You can match nested parentheses using regex up to a fixed level. But more than 2 levels will become rather messy (2 already is, to be frank). This will match the parenthesized parts of your examples:

\(([^()]*+|\([^()]*+\))*\)

A quick explanation:

\(              # match a '(' 
(               # open group 1
  [^()]*+       #   match any chars other than '(' and ')'
  |             #   OR
  \([^()]*+\)   #   match '(...)'
)*              # close group 1 and repeat it zero or more times
\)              # match a ')' 

See the demo on ideone.com
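
In case the demo link is no longer available, here is a minimal sketch of how the pattern might be used from Java. The class name TwoLevelParens and the method findGroups are hypothetical, and the leading [^()]* is an addition that is not part of the pattern above: it pulls the prefix (xx, qq) into each match, which is what the question asks for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
final class TwoLevelParens {

    // The pattern from this answer, with "[^()]*" prepended (an addition)
    // so that the text before the opening parenthesis is included.
    private static final Pattern GROUP =
            Pattern.compile("[^()]*\\(([^()]*+|\\([^()]*+\\))*\\)");

    /** Returns every match of the pattern, in order of appearance. */
    static List<String> findGroups(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = GROUP.matcher(input);
        while (m.find()) {
            result.add(m.group()); // the whole match, prefix included
        }
        return result;
    }
}
```

For the string in the question this should yield xx(yyzz(iijj)) and qq((kkll)(gghh)). Input nested three or more levels deep is not matched as a whole, which is the fixed-level limitation mentioned above.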

There are regex flavors that can match arbitrarily deep nesting (Perl, .NET, PHP), but Java is not one of them.

But looking at the comment you posted under your question, I wouldn't handle this with regex, but with a proper parser (be it a handcrafted one or a generated one).

Bart Kiers