I have the following ANTLR grammar that forms part of a larger expression parser:
grammar ProblemTest;
atom : constant
| propertyname;
constant: (INT+ | BOOL | STRING | DATETIME);
propertyname
: IDENTIFIER ('/' IDENTIFIER)*;
IDENTIFIER
: ('a'..'z'|'A'..'Z'|'0'..'9'|'_')+;
INT
: '0'..'9'+;
BOOL : ('true' | 'false');
DATETIME
: 'datetime\'' '0'..'9'+ '-' '0'..'9'+ '-' + '0'..'9'+ 'T' '0'..'9'+ ':' '0'..'9'+ (':' '0'..'9'+ ('.' '0'..'9'+)*)* '\'';
STRING
: '\'' ( ESC_SEQ | ~('\\'|'\'') )* '\''
;
fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
fragment
ESC_SEQ
: '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
| UNICODE_ESC
| OCTAL_ESC
;
fragment
OCTAL_ESC
: '\\' ('0'..'3') ('0'..'7') ('0'..'7')
| '\\' ('0'..'7') ('0'..'7')
| '\\' ('0'..'7')
;
fragment
UNICODE_ESC
: '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
;
If I invoke this in the interpreter from within ANTLR works with
'Hello\\World'
then this is getting interpreted as a propertyname instead of a constant. The same happens if I compile this in C# and run it in a test harness, so it is not a problem with the dodgy interpreter
I'm sure I am missing something really obvious... but why is this happening? It's clear there is a problem with the string matcher, but I would have thought at the very least that the fact that IDENTIFIER does not match the ' character would mean that this would throw a NoViableAltException instead of just falling through?