Environment: ANTLR 4.7.1

The grammar is:

grammar Whilelang;
program : seqStatement;
seqStatement: statement (';' statement)* ;
statement: ID ':=' expression                          # attrib
     | 'print' Text                                # print
     | '{' seqStatement '}'                        # block
     ;
expression: INT                                        # int
      | ID                                         # id
      | expression ('+'|'-') expression            # binOp
      | '(' expression ')'                         # expParen
      ;
bool: ('true'|'false')                                 # boolean
    | expression '=' expression                        # relOp
    | expression '<=' expression                       # relOp
    | 'not' bool                                       # not
    | bool 'and' bool                                  # and
    | '(' bool ')'                                     # boolParen
;
INT: ('0'..'9')+ ;
ID: ('a'..'z')+;
Text: '"' .*? '"';
Space: [ \t\n\r] -> skip;

The input program is:

a := 1
b := 2

According to the grammar, ANTLR 4 should report an error such as "expected ';' at line 1" for the input above. In fact, no error is reported. The parser seems to accept only part of the input and does not consume all of the input tokens. Is this a bug in ANTLR 4?

$ grun Whilelang program -trace
a := 1
b := 2
^d
enter   program, LT(1)=a
enter   seqStatement, LT(1)=a
enter   statement, LT(1)=a
consume [@0,0:0='a',<17>,1:0] rule statement
consume [@1,2:3=':=',<2>,1:2] rule statement
enter   expression, LT(1)=1
consume [@2,5:5='1',<16>,1:5] rule expression
exit    expression, LT(1)=b
exit    statement, LT(1)=b
exit    seqStatement, LT(1)=b
exit    program, LT(1)=b

1 Answer

Not a bug. ANTLR is doing exactly what it was asked to do.

Given the rules

program : seqStatement;
seqStatement: statement (';' statement)* ;

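the program rule is complete as soon as one statement has been matched. After a := 1 the next token is b, not ';', so the parser cannot match the optional (';' statement)* part. Nothing else is required by the grammar, so the parser stops there without reporting an error and leaves the remaining tokens unread.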

Changing to

program : seqStatement EOF;

requires the program rule to keep matching statements until it can also match an EOF token (the lexer automatically adds an EOF token at the end of the source text). This is likely the behavior you are looking for.
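To make the difference concrete, here is a minimal hand-written sketch of the same idea in Python. It is not ANTLR's generated parser: the names tokenize, parse_statement, parse_seq_statement, parse_program and the require_eof flag are hypothetical, and the statement rule is reduced to ID ':=' INT. It only illustrates why a start rule without EOF can return successfully while input is left over.

```python
import re

# Hypothetical tokenizer for a reduced Whilelang: ':=', ';', IDs and INTs only.
TOKEN_RE = re.compile(r"\s*(:=|;|[a-z]+|[0-9]+)")


def tokenize(text):
    """Split text into tokens and append an explicit end-of-input marker."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    tokens.append("<EOF>")  # plays the role of the EOF token
    return tokens


def parse_statement(tokens, i):
    """statement: ID ':=' INT  (reduced form of the 'attrib' alternative)."""
    if not tokens[i].isalpha():
        raise SyntaxError(f"expected ID but found {tokens[i]!r}")
    if tokens[i + 1] != ":=":
        raise SyntaxError(f"expected ':=' but found {tokens[i + 1]!r}")
    if not tokens[i + 2].isdigit():
        raise SyntaxError(f"expected INT but found {tokens[i + 2]!r}")
    return i + 3


def parse_seq_statement(tokens, i):
    """seqStatement: statement (';' statement)*"""
    i = parse_statement(tokens, i)
    while tokens[i] == ";":  # the loop is optional: zero iterations is valid
        i = parse_statement(tokens, i + 1)
    return i


def parse_program(tokens, require_eof):
    """program: seqStatement        (require_eof=False)
       program: seqStatement EOF    (require_eof=True)
    Returns the number of tokens consumed before the end marker."""
    i = parse_seq_statement(tokens, 0)
    if require_eof and tokens[i] != "<EOF>":
        raise SyntaxError(f"expected ';' or <EOF> but found {tokens[i]!r}")
    return i


tokens = tokenize("a := 1\nb := 2")
consumed = parse_program(tokens, require_eof=False)
print(f"without EOF: consumed {consumed} of {len(tokens) - 1} tokens, no error")

try:
    parse_program(tokens, require_eof=True)
except SyntaxError as err:
    print(f"with EOF: {err}")
```

Without the end-of-input check, the sketch stops after the first statement and silently ignores b := 2, which mirrors the trace in the question. With the check, the leftover b is rejected. Note that a real ANTLR parser reports the problem through its error listener and attempts recovery rather than raising an exception as this sketch does.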

GRosenberg