3

The Wikipedia article on PEGs states:

The fundamental difference between context-free grammars and parsing expression grammars is that the PEG's choice operator is ordered. If the first alternative succeeds, the second alternative is ignored. Thus ordered choice is not commutative, unlike unordered choice as in context-free grammars and regular expressions.

But this question has discovered that if the alternatives are substrings of each other, then regexes do not behave according to unordered choice. The wiki is correct for the most part but does not account for this edge case. Am I correct in my assessment?

Frankie Ribery

1 Answer

1

"regex" != "regular expression". The latter are pure and simple and of interest only to theoretical computer scientists and symbolic mathematicians.

"ordered choice" is an implementation option of regex processors.

You say "if the alternatives are substrings of each other, then regexes do not behave according to unordered choice".

A more accurate statement would be "Some regex processors use ordered choice for ALL alternations. This only becomes noticeable when one alternative is a prefix of another."
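As a minimal sketch of that behaviour, assuming Python's standard `re` module as the regex processor (the helper `first_match` is a hypothetical name introduced only for this illustration), swapping the order of two alternatives changes the result when one is a prefix of the other:

```python
import re

def first_match(pattern, text):
    """Return the text matched at the start of `text`, or None if no match."""
    m = re.match(pattern, text)
    return m.group(0) if m else None

# Python's re tries alternatives left to right and keeps the first one
# that lets the overall pattern succeed.
print(first_match(r"a|ab", "ab"))  # a
print(first_match(r"ab|a", "ab"))  # ab

# If the rest of the pattern fails after the first alternative, the engine
# backtracks and tries the next one.
print(first_match(r"(?:a|ab)$", "ab"))  # ab
```

With alternatives that are not prefixes of one another, such as `cat|dog`, the order makes no observable difference, which is why the ordering usually goes unnoticed.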

John Machin
  • What you call "regex" is often referred to as regular expressions, and I argue that this is also correct (for example, the [JavaDoc of the Java class `Pattern`](http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html) claims that it is "A compiled representation of a regular expression."). It is true, however, that there are two related but separate things at work: pure mathematical regular expressions and "real-world" computing regular expressions. They are based on the same theories, but are **not** equivalent. – Joachim Sauer May 06 '11 at 06:16
  • OK. I wasn't aware that they were different. I am working from Python's documentation, which has a "Regular Expression HOWTO". – Frankie Ribery May 06 '11 at 09:36
  • When "regular expressions" are mentioned in the same breath as "context-free grammars" and "parsing expression grammars", what is meant are the pure and simple regular expressions. – John Machin May 06 '11 at 11:02