This seems to be something very basic that I don't understand here.
Why doesn't "babc" match / a * /?
> "abc" ~~ / a /
「a」
> "abc" ~~ / a * /
「a」
> "babc" ~~ / a * /
「」 # WHY?
> "babc" ~~ / a + /
「a」
Because the * quantifier makes the preceding atom match zero or more times, / a * / can also match the empty string.
「」 is the first match of / a * / in any string that does not start with "a". For example:
say "xabc" ~~ / a * . /; # OUTPUT: 「x」
This matches the same text as:
say "xabc" ~~ / (a+)? . /;
If you make the pattern more precise, you will get a different result:
say "xabc" ~~ / x a * /; # OUTPUT: 「xa」
say "xabc" ~~ / a * b /; # OUTPUT: 「ab」
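To make the point above concrete, here is a minimal sketch showing that an empty match still counts as a successful match, while a + quantifier fails when there is nothing to match. The variable names $star and $plus are only for illustration.

```raku
# a* succeeds even when the string contains no 'a' at all:
# it matches zero characters at the very start of the string.
my $star = 'xyz' ~~ / a* /;
say $star.so;      # True
say $star.chars;   # 0

# a+ needs at least one 'a', so the match fails here.
my $plus = 'xyz' ~~ / a+ /;
say $plus.so;      # False
```

So 「」 is not a failed match; it is a successful match of length zero.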
The answers here are correct; I'll just try to present them in a more coherent form:
The regex engine always starts at the left of the string, and prefers left-most matches over longer matches.
The * quantifier also matches empty strings.
The regex a* can match the strings '', 'a', 'aa' etc.
At a given starting position it will always prefer the longest match it finds, but if it can't find a match longer than the empty string, it'll just match the empty string.
In 'abc' ~~ /a*/, the regex engine starts at position 0, the a* matches as many a's as it can, and thus matches the first character.
In 'babc' ~~ /a*/, the regex engine starts at position 0, and the a* can match only zero characters. It does so successfully. Since the overall match succeeds, there is no reason to try again at position 1.
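The positions described above can be inspected directly on the Match object. This is a small sketch using the .from and .to methods; the variable names $empty and $one are only for illustration.

```raku
# a* succeeds immediately at position 0 with a zero-width match,
# so the engine never moves on to the 'a' at position 1.
my $empty = 'babc' ~~ / a* /;
say $empty.from;   # 0
say $empty.to;     # 0

# a+ cannot match at position 0, so the engine retries at position 1
# and finds the 'a' there.
my $one = 'babc' ~~ / a+ /;
say $one.from;     # 1
say $one.to;       # 2
```

If the goal is to find the first run of a's anywhere in the string, a+ (or a more specific pattern around a*) is the form to use.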