Please consider the following grammar, which gives me unexpected behavior:
lexer grammar TLexer;
WS : [ \t]+ -> channel(HIDDEN) ;
NEWLINE : '\n' -> channel(HIDDEN) ;
ASTERISK : '*' ;
SIMPLE_IDENTIFIER : [a-zA-Z_] [a-zA-Z0-9_$]* ;
NUMBER : [0-9] [0-9_]* ;
and
parser grammar TParser;
options { tokenVocab=TLexer; }
seq_input_list :
level_input_list | edge_input_list ;
level_input_list :
( level_symbol_any )+ ;
edge_input_list :
( level_symbol_any )* edge_symbol ;
level_symbol_any :
{getCurrentToken().getText().matches("[0a]")}? ( NUMBER | SIMPLE_IDENTIFIER ) ;
edge_symbol :
SIMPLE_IDENTIFIER | ASTERISK ;
The input 0 * is parsed fine, but 0 f is not recognized by the parser (no viable alternative at input 'f'). If I swap the order of the two alternatives in seq_input_list, both inputs are recognized.
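To rule out the predicate expression itself, here is a minimal sketch in plain Java that evaluates the same regex against the token texts involved. It does not use ANTLR at all. The class name PredicateCheck and the method passes are hypothetical helpers made up for this check, and the token text is passed in directly instead of coming from getCurrentToken().

```java
// Hypothetical stand-alone check, independent of ANTLR: it only evaluates
// the regex used in the level_symbol_any predicate.
final class PredicateCheck {
    // Same expression as in the grammar, with the token text passed in directly.
    // String.matches requires the whole string to match the pattern.
    static boolean passes(String tokenText) {
        return tokenText.matches("[0a]");
    }

    public static void main(String[] args) {
        // Token texts from the two test inputs, plus 'a' as a second accepted symbol.
        for (String text : new String[] {"0", "a", "f", "*"}) {
            System.out.println("'" + text + "' -> " + passes(text));
        }
    }
}
```

The expression is true for 0 and a and false for f and *, so on its own the predicate treats f exactly like *. That is why I would expect f to fall through to edge_symbol in the same way * does.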
My question is whether this is indeed an ANTLR issue or whether I misunderstand how semantic predicates are supposed to be used. I would expect the input 0 f to be recognized as
(seq_input_list (edge_input_list (level_symbol_any NUMBER) (edge_symbol SIMPLE_IDENTIFIER)))
Thank you in advance!
Julian