The Javadoc for java.util.regex.Pattern says \cx represents "The control character corresponding to x". So I thought Pattern.compile() would reject a \c followed by any character outside [@-_], but it doesn't!
As @tchrist commented on one of the answers to What is a regular expression for control characters?, the range is not checked at all. I tested a couple of characters from higher blocks and from the astral planes as well; it looks like it merely flips the 7th-lowest bit of the code point value (that is, it XORs the code point with 64).
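A minimal sketch of how this observation can be checked, assuming the XOR-with-64 behaviour described above holds on the JDK in use. The class name ControlEscapeDemo and the helper controlEscapeMatches are hypothetical names introduced only for illustration; the sample characters are arbitrary.

```java
import java.util.regex.Pattern;

class ControlEscapeDemo {
    // Hypothetical helper: returns true if the regex \c<x> matches the
    // single code point `expected`.
    public static boolean controlEscapeMatches(int x, int expected) {
        String regex = "\\c" + new String(Character.toChars(x));
        String input = new String(Character.toChars(expected));
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Characters inside [@-_], outside it, and one from an astral plane.
        int[] samples = { 'A', '@', '_', 'a', '1', 0x1F600 };
        for (int x : samples) {
            int flipped = x ^ 64; // assumed behaviour: 7th-lowest bit flipped
            System.out.printf("\\c + U+%04X vs U+%04X: %b%n",
                    x, flipped, controlEscapeMatches(x, flipped));
        }
    }
}
```

If the assumption is right, a pattern such as \ca compiles without error and matches "!" (0x61 XOR 0x40 = 0x21) rather than being rejected.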
So is it a Javadoc bug or an implementation bug, or am I misunderstanding something? Is \cx a Java-invented syntax, or is it supported by other regex engines, especially Perl? How is it handled there?