What is the use of max m in the lazy quantifiers {n,m}??

Question

In regex, we have greedy and lazy quantifiers. The greedy quantifier {n,m} matches the preceding atom/character/group a minimum of n and a maximum of m occurrences, inclusive.

If I have a collection of strings:

a
aa
aaa
aaaa
aaaaaaaaaa

With a{2,4}, it matches:

nothing on first line
aa on second
aaa on third
aaaa on fourth
(aaaa), (aaaa), and (aa) on fifth line

That makes sense.

However, with the lazy quantifier a{2,4}?, I get:

nothing on first line
aa on second line
aa on third line
(aa) and (aa) on fourth line
(aa), (aa), (aa), (aa), and (aa) on fifth line

That also makes sense: it finds the shortest possible match.
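Both result lists above can be reproduced outside Vim. This is a minimal sketch, assuming Python's re module behaves like Vim's search for this pattern; the variable names are only for illustration.

```python
import re

lines = ["a", "aa", "aaa", "aaaa", "aaaaaaaaaa"]

# Greedy: each match takes as many a's as it can, up to 4.
greedy = [re.findall(r"a{2,4}", s) for s in lines]

# Lazy: each match stops as soon as the minimum of 2 is reached.
lazy = [re.findall(r"a{2,4}?", s) for s in lines]

for s, g, l in zip(lines, greedy, lazy):
    print(f"{s:<10} greedy={g} lazy={l}")
```

With nothing after the quantifier, the lazy form never needs more than the minimum, so the upper bound has no visible effect here.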

What I want to clarify: is there any use in giving a lazy quantifier of the form {n,m}? a max value m (in this case, the 4 in {2,4}?)? Isn't the result always the same as with {2,}??

Is there a scenario where passing a max (like the 4 in {2,4}?) is useful in a lazy quantifier?

Disclaimer: I am actually using the regular expression to search inside Vim (/a{-2,4}), not in any scripting language. I think the principle of the question is still the same.

It depends on the regex library/implementation. What is the regex flavor/programming language? — Wiktor Stribiżew, Jan 17 '22 at 21:18

score 0 · Accepted Answer · answered Jan 17 '22 at 16:57

It matters when you need to consider what follows the lazily quantified expression. Laziness keeps the quantified expression from consuming characters that a later expression in the concatenation should match. Consider the string aaaaab:

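The string is not matched from its start by a{2,4}?b, as there are too many a's before the b for a{2,4} to match. (An unanchored search can still match aaaab, starting at the second a.)
The string is matched from its start by a{2,}?b, since it can match as many a's as necessary.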
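A quick way to check both points, sketched with Python's re module (an assumption, since the question itself is about Vim). re.match anchors at the start of the string, which is the situation where the upper bound changes the outcome; re.search is included to show what an unanchored search finds instead.

```python
import re

s = "aaaaab"

# Anchored at the start: five a's precede the b, more than {2,4} allows.
bounded_anchored = re.match(r"a{2,4}?b", s)

# No upper bound: the lazy quantifier grows until the b can match.
unbounded_anchored = re.match(r"a{2,}?b", s)

# Unanchored: the engine retries from later start positions.
bounded_search = re.search(r"a{2,4}?b", s)

print(bounded_anchored)            # no match object
print(unbounded_anchored.group())  # the whole string
print(bounded_search.group())      # starts at the second a
```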

chepner

In case of aaaaab, a{2,4}b would match aaaab, right? However, I was expecting a{2,4}?b to match aab but when I tried it instead it matches aaaab. Interesting. – Iggy Jan 18 '22 at 00:10
I think the issue is that lazy or not, it's still going to start as far left as possible. Consider something like `re.match(r'a*(a{2,4}?b)', 'aaaaab')`. – chepner Jan 18 '22 at 00:14
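A sketch of the idea in that last comment, assuming the intended subject is the aaaaab string discussed above: a leading a* absorbs the surplus a's, so the captured group ends up as the aab that was originally expected.

```python
import re

s = "aaaaab"

# Leftmost start wins: the attempt at the first a fails, and the attempt
# at the second a succeeds only by taking four a's before the b.
plain = re.search(r"a{2,4}?b", s).group()

# A greedy a* prefix eats the extra a's and backtracks just enough
# to leave two a's for the group.
with_prefix = re.match(r"a*(a{2,4}?b)", s).group(1)

print(plain)
print(with_prefix)
```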
