1

I want a greedy match against an alternation of either zero to 'm' consecutive occurrences of 'a' or zero to 'n' consecutive occurrences of 'b'. If I do

/a{,m}|b{,n}/

it does not work: when the input is a sequence of 'b', the first alternative 'a{,m}' succeeds by matching zero 'a's, so the alternative 'b{,n}' is never tried and the result is not a greedy match.
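To make the failure concrete, here is a minimal sketch in Python's `re` module (my choice of engine, not something the question specifies), with m = 3 and n = 4 picked arbitrarily. The helper name `leading_match` is hypothetical.

```python
import re

# m = 3 and n = 4 are arbitrary stand-ins for the question's m and n.
PATTERN = re.compile(r"a{,3}|b{,4}")

def leading_match(text):
    """Return (matched text, span) for a match attempted at the start of text."""
    m = PATTERN.match(text)  # always succeeds: both branches accept zero characters
    return m.group(), m.span()

# The a{,3} branch succeeds with zero a's, so b{,4} is never tried.
print(leading_match("bbbb"))  # ('', (0, 0))
```

The empty match at position 0 is the whole result; the run of b's is left untouched.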

PeeHaa
sawa

2 Answers

1

If I understand correctly what you're trying to do, how about `/(?:a{1,m}|b{1,n})?/`?

It'll match either a run of consecutive a's (up to m of them), or a run of consecutive b's (up to n of them), or nothing at all, because the trailing `?` makes the whole group optional.
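A quick check of that behaviour, sketched in Python's `re` module as an assumption about the engine, again with m = 3 and n = 4 chosen arbitrarily (`leading_run` is a hypothetical helper name):

```python
import re

# Each branch now needs at least one character; the outer ? covers the empty case.
FIXED = re.compile(r"(?:a{1,3}|b{1,4})?")

def leading_run(text):
    """Return the run matched at the start of text (possibly empty)."""
    return FIXED.match(text).group()

print(repr(leading_run("aaaaa")))   # 'aaa'  (capped at m = 3)
print(repr(leading_run("bbbbbb")))  # 'bbbb' (capped at n = 4)
print(repr(leading_run("ccc")))     # ''     (the optional group matches nothing)
```

Because `a{1,3}` can no longer succeed on zero characters, a leading run of b's makes the first branch fail and the engine moves on to `b{1,4}`.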

Daniel Vandersluis
  • I think the problem in the original question is that for the sequence "aaaabbbb" @sawa wants to match the whole sequence, rather than just the "a" or "b" sections. – glenatron Mar 31 '11 at 15:59
  • Thanks. That is exactly what I wanted. Why didn't I come up with this? Nice. @glenatron Daniel's regex is what I wanted. Please wait 5 minutes for me to accept the answer. – sawa Mar 31 '11 at 15:59
  • @sawa Keep in mind that `a{1,m}` will match a substring when more than *m* a's are present, unless you somehow [anchor](http://www.regular-expressions.info/anchors.html) your pattern (i.e. `/a{1,4}/` will match the first four a's in `aaaaaaaaaa`). – Daniel Vandersluis Mar 31 '11 at 16:04
  • I know that. I just want to match up to m, even if there are more. – sawa Mar 31 '11 at 16:05
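The anchoring point from the comments can be sketched as follows, assuming Python's `re` module; `re.fullmatch` stands in here for wrapping the pattern in `^...$`.

```python
import re

RUN = "a" * 10

# Unanchored: the pattern is satisfied by the first four a's of a longer run.
unanchored = re.search(r"a{1,4}", RUN)

# Anchored at both ends: ten a's exceed the upper bound, so there is no match.
anchored = re.fullmatch(r"a{1,4}", RUN)

print(unanchored.group())  # aaaa
print(anchored)            # None
```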
0

I think quantifiers are greedy by default and alternatives are tried from left to right. So it's not really a greediness issue you were having; it's the a{0,m} in the alternation succeeding (with zero a's) in the presence of non-a's. It would have matched up to 'm' a's had they been present first.
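Both halves of that claim can be checked with a small sketch, assuming Python's `re` module (a backtracking engine like Perl's) and m = 3, n = 4 as arbitrary values:

```python
import re

# Greedy quantifier: when a's come first, a{,3} takes as many as it is allowed.
greedy = re.match(r"a{,3}|b{,4}", "aaaaab").group()

# Ordered alternation: the first branch that succeeds wins, even if a later
# branch would have matched more text.
first_wins = re.match(r"a|ab", "ab").group()
reordered = re.match(r"ab|a", "ab").group()

print(greedy, first_wins, reordered)  # aaa a ab
```

So the quantifier itself is greedy; it is the alternation that stops at the first successful branch rather than looking for the longest one.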

Greediness seems more complicated than one might guess.

'aaaaaaaaaa' =~ /(a{1,2}) (a{1,2}?) (a{1,4}) (a{4,12}+)/x &&
print "'$1', '$2', '$3', '$4'";

'aa', 'a', 'aaa', 'aaaa'
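For readers without Perl at hand, the same split can be reproduced in Python. This is only a sketch: older versions of Python's `re` have no possessive quantifiers, so the possessive `a{4,12}+` is emulated with a capturing lookahead followed by a backreference, a substitution of mine rather than part of the answer. The name `split_run` is hypothetical.

```python
import re

# Stand-in for the possessive a{4,12}+ (an assumption of this sketch): the
# lookahead captures the run as group 4, then \4 consumes exactly that text
# without being able to give characters back.
PATTERN = re.compile(r"(a{1,2})(a{1,2}?)(a{1,4})(?=(a{4,12}))\4")

def split_run(text):
    """Return the four captured pieces, or None if the pattern does not match."""
    m = PATTERN.match(text)
    return m.groups() if m else None

print(split_run("a" * 10))  # ('aa', 'a', 'aaa', 'aaaa')
```

The third group is greedy and first takes four a's, which leaves only three for the last group's minimum of four. The engine backtracks, the third group gives one back, and the last group then takes the remaining four.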