5

Given an alternation like /(foo|foobar|foobaz)/, does Perl 6 make any promises about which of the three will be used first, and if it does, where in the documentation does it make that promise?

See the related question "Does Perl currently (5.8 and 5.10) make any promises about the order alternations will be used?".

Elizabeth Mattijsen
Chas. Owens
  • It strikes me that Perl 6 makes a lot of promises. Until Larry gives us an actual release date, I won't believe any of them :-) – bedwyr Apr 20 '09 at 00:27
  • Heh, well, rules seem to be mostly done and I don't expect any changes to them at this point. That said, if it changes then the answer here can change as well. – Chas. Owens Apr 20 '09 at 00:30

2 Answers

13

To put it in only a few words: the alternatives should be matched (at least notionally) in parallel, and the longest match wins. If you want sequential alternation, you can use the double bar ||, which promises a left-to-right order just like | does in Perl 5 regexes.
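As a minimal sketch of that difference, assuming a current Rakudo implementation of Perl 6 (the variable names `$longest` and `$leftmost` are only illustrative), the alternation from the question can be tried against the string "foobar" with both operators:

```raku
# Longest-token alternation: | picks the alternative that matches the longest prefix,
# regardless of the order in which the alternatives are written.
my $longest  = "foobar" ~~ / foo | foobar | foobaz /;

# Sequential alternation: || tries the alternatives left to right,
# and the first one that succeeds wins (the Perl 5 behaviour of |).
my $leftmost = "foobar" ~~ / foo || foobar || foobaz /;

say ~$longest;   # foobar
say ~$leftmost;  # foo
```

With |, reordering the three alternatives should not change the result for this input; with ||, putting foobar first would make it win instead of foo.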

moritz
  • 12,710
  • 1
  • 41
  • 63
10

S05 says

To that end, every regex in Perl 6 is required to be able to distinguish its "pure" patterns from its actions, and return its list of initial token patterns (transitively including the token patterns of any subrule called by the "pure" part of that regex, but not including any subrule more than once, since that would involve self reference, which is not allowed in traditional regular expressions). A logical alternation using | then takes two or more of these lists and dispatches to the alternative that matches the longest token prefix. This may or may not be the alternative that comes first lexically.

However, if two alternatives match at the same length, the tie is broken first by specificity. The alternative that starts with the longest fixed string wins; that is, an exact match counts as closer than a match made using character classes. If that doesn't work, the tie is broken by one of two methods. If the alternatives are in different grammars, standard MRO (method resolution order) determines which one to try first. If the alternatives are in the same grammar file, the textually earlier alternative takes precedence. (If a grammar's rules are defined in more than one file, the order is undefined, and an explicit assertion must be used to force failure if the wrong one is tried first.)

This seems to be a very different promise from the one made in Perl 5.

Community
Chas. Owens
  • Reinforcing Moritz's answer & providing doc links... P6 has two built-in alternation types. It uses `|` for the [Longest](https://docs.perl6.org/language/regexes#Longest_Alternation:_|) type and `||` for the [Leftmost](https://docs.perl6.org/language/regexes#Alternation:_||) type. The text you've quoted applies to the new Longest alternation type which is indeed completely different from the Leftmost type which corresponds to P5's type. (If you use [`:P5`](https://docs.perl6.org/language/5to6-nutshell#Add_:P5_or_:Perl5_adverb) you can use `|` as it's used in P5.) – raiph Jun 13 '18 at 08:52