POSIX, aka "The Open Group Base Specifications Issue 7, 2018 edition", has this to say about regular expression operator precedence:
9.4.8 ERE Precedence
The order of precedence shall be as shown in the following table:
ERE Precedence (from high to low)
Collation-related bracket symbols    [==] [::] [..]
Escaped characters                   \<special character>
Bracket expression                   []
Grouping                             ()
Single-character-ERE duplication     * + ? {m,n}
Concatenation                        ab
Anchoring                            ^ $
Alternation                          |
I am curious as to the reason for the first two levels being in that order. Being a Unix user from way back, I am accustomed to being able to "throw a backslash in front of it" to escape virtually anything. But it appears that with Collation-Related Bracket Symbols (CRBS), I can't do that. If I want to match a literal [.ch.] I can't just type \[.ch.] and rely on "dot matches dot" to handle things for me. I now have to match something like [[].ch.] (or possibly worse?).
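For anyone who wants to check how their own system treats these candidate patterns, here is a minimal sketch using the POSIX regcomp()/regexec() interface with REG_EXTENDED. The helper names ere_matches and report are made up for this illustration, and results may differ between regex implementations, so treat it as a test harness rather than a statement of what the standard requires.

```c
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the ERE `pattern` matches somewhere in
 * `subject`, 0 if regexec() reports anything other than a match, and -1 if
 * regcomp() rejects the pattern. */
int ere_matches(const char *pattern, const char *subject)
{
    regex_t re;

    /* REG_EXTENDED selects ERE syntax; REG_NOSUB skips match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}

/* Hypothetical helper: prints one line describing how the local regex
 * implementation handles `pattern` against `subject`. */
void report(const char *pattern, const char *subject)
{
    int r = ere_matches(pattern, subject);

    printf("%-14s vs %-8s : %s\n", pattern, subject,
           r < 0 ? "rejected by regcomp" : r ? "match" : "no match");
}
```

Calling report("[[].ch.]", "[.ch.]") shows what the bracket-expression form does with the literal text, and report("\\[.ch.]", "[.ch.]") (the backslash is doubled only because it sits inside a C string literal) does the same for the backslash form. Trying a second subject such as "[xchx]" shows whether the dots are being treated as literals or as wildcards.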
I'm trying, and failing, to imagine what the scenario was when whoever-thought-this-up decided this should be the order. Is there a concrete scenario where having CRBS ranked higher than backslash makes sense, or was this a case of "we don't understand CRBS yet so let's make it higher priority" or ... what, exactly?