I tried to use regexes for finding max-length sequences formed from repeated doubled letters, like AABB in the string xAAABBBBy.
As described in the official documentation:
The '*', '+', and '?' quantifiers are all greedy; they match as much text as possible.
When I use the quantifier {n,}, I get a full substring, but + returns only parts:
import re
print(re.findall("((AA|BB){3,})", "xAAABBBBy"))
# [('AABBBB', 'BB')]
print(re.findall("((AA|BB)+)", "xAAABBBBy"))
# [('AA', 'AA'), ('BBBB', 'BB')]
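To see where each match begins, the same two patterns can be run through re.finditer and the spans printed. This is only a diagnostic sketch: the helper name spans is made up for illustration, and the inner group is made non-capturing just to keep the output short.

```python
import re

text = "xAAABBBBy"

def spans(pattern, s):
    """Return (start, end, matched_text) for each non-overlapping match."""
    return [(m.start(), m.end(), m.group(0)) for m in re.finditer(pattern, s)]

print(spans(r"(?:AA|BB){3,}", text))
# [(2, 8, 'AABBBB')]
print(spans(r"(?:AA|BB)+", text))
# [(1, 3, 'AA'), (4, 8, 'BBBB')]
```

The {3,} match starts at index 2, while the first + match starts at index 1, so the two patterns do not begin matching at the same position.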
Why is {n,} more greedy than +?