Antlr Eclipse IDE White Space not being skipped

Question

I apologize in advance if this question has already been asked; I can't seem to find it.

I'm just beginning with ANTLR, using antlr4IDE for Eclipse to create a parser for a small subset of Java. For some reason, unless I explicitly include a whitespace token in my rules, the parser throws an error.

My grammar:

grammar Hello;


r  : 
    (Statement ';')+  
    ;         


Statement: 
    DECL | INIT 
    ;

DECL: 
    'int' ID 
    ; 

INIT: 
    DECL '=' NUMEXPR 
    ;

NUMEXPR : 
    Number OP Number | Number 
    ;

OP : 
      '+' 
    | '-' 
    | '/' 
    | '*' 
    ; 

WS  :  
    [ \t\r\n\u000C]+ -> skip
    ;

Number: 
    [0-9]+ 
    ;

ID : 
    [a-zA-Z]+ 
    ;

When trying to parse

    int hello = 76;

I receive the error:

 Hello::r:1:0: mismatched input 'int' expecting Statement
 Hello::r:1:10: token recognition error at: '='

However, when I manually add the token WS into the rules, I receive no error.

Any ideas where I'm going wrong? I'm new to Antlr, so I'm probably making a stupid mistake. Thanks in advance.

EDIT: Here is my parse tree and error log: (images not preserved)

score 1 · Accepted Answer · answered Mar 20 '17 at 08:23

Change the grammar like this, so that every rule that should be a parser rule starts with a lowercase letter:

grammar Hello;
r         : (statement ';')+ ;         
statement : decl | init ;
decl      : 'int' ID  ; 
init      : decl '=' numexpr ;
numexpr   : Number op Number | Number ;
op        : '+' | '-' | '/' | '*' ; 
WS        : [ \t\r\n\u000C]+ -> skip ;
Number    : [0-9]+ ;
ID        : [a-zA-Z]+ ;

I'm still having the same error, even after copying exactly what you wrote. I posted the parse tree in the original post – Slavvio Mar 20 '17 at 15:07
Never mind, it worked, thank you! Follow-up question: do you happen to know why the capitalization of the rule names made a difference? – Slavvio Mar 20 '17 at 15:14
See [Grammar Lexicon](https://github.com/antlr/antlr4/blob/4.6/doc/lexicon.md). Token names always start with a capital letter. Parser rule names always start with a lowercase letter. – Mar 20 '17 at 22:02
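
To make the capitalization point concrete, here is a rough Python sketch (not ANTLR itself) that imitates how a lexer picks tokens: the longest match wins, ties go to the rule listed first, and `-> skip` drops whitespace only between tokens. The names `tokenize`, `ORIGINAL_RULES` and `FIXED_RULES` are hypothetical, and the regexes are hand-written approximations of the two grammars. It assumes that literals used in parser rules become implicit tokens, while literals inside lexer rules (such as '=' in the capitalized INIT) do not.

```python
import re

NUMBER = r"[0-9]+"
ID = r"[a-zA-Z]+"
OP = r"[-+/*]"
WS = r"[ \t\r\n\f]+"
NUMEXPR = rf"{NUMBER}{OP}{NUMBER}|{NUMBER}"
DECL = rf"int{ID}"                    # as a lexer rule: 'int' directly followed by letters
INIT = rf"{DECL}=(?:{NUMEXPR})"
STATEMENT = rf"(?:{INIT})|(?:{DECL})"  # longer alternative first, to approximate longest match

# Question's grammar: every capitalized rule is a lexer rule.
# ';' is the only literal used in a parser rule, so it is the only implicit token.
ORIGINAL_RULES = [
    ("';'", r";"), ("Statement", STATEMENT), ("DECL", DECL), ("INIT", INIT),
    ("NUMEXPR", NUMEXPR), ("OP", OP), ("WS", WS), ("Number", NUMBER), ("ID", ID),
]

# Accepted answer's grammar: the literals in the lowercase parser rules
# become implicit tokens; only WS, Number and ID remain explicit lexer rules.
FIXED_RULES = [
    ("';'", r";"), ("'int'", r"int"), ("'='", r"="),
    ("'+'", r"\+"), ("'-'", r"-"), ("'/'", r"/"), ("'*'", r"\*"),
    ("WS", WS), ("Number", NUMBER), ("ID", ID),
]

def tokenize(text, rules):
    """Return (tokens, unrecognized characters) for text under the given rules."""
    pos, tokens, errors = 0, [], []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strict '>' keeps the earlier rule when two matches have equal length.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            errors.append(text[pos])   # comparable to "token recognition error"
            pos += 1
            continue
        if best_name != "WS":          # imitate "-> skip": discard the token
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens, errors

original_tokens, original_errors = tokenize("int hello = 76;", ORIGINAL_RULES)
fixed_tokens, fixed_errors = tokenize("int hello = 76;", FIXED_RULES)
print(original_tokens, original_errors)
print(fixed_tokens, fixed_errors)
```

Under these assumptions, the first run yields `int` as a plain ID (the capitalized DECL needs letters immediately after `int`, and the space breaks that) and reports `=` as unrecognized, which lines up with the two errors in the question. The second run yields `'int'`, ID, `'='`, Number, `';'` with no errors, which is the token stream the lowercase parser rules expect.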

score 0 · Answer 2 · edited May 23 '17 at 11:46

After looking at the ANTLR 4 documentation, it seems you need a lexer rule for every character sequence you expect to see in your file, from start to finish, not just the ones you want to handle.

In that regard, you are expected to handle whitespace explicitly, with something like:

WS : [ \t\r\n]+ -> skip;

That's why the skip command exists:

A 'skip' command tells the lexer to get another token and throw out the current text.

Though note that sometimes this can cause a little trouble such as in this post.

answered Mar 20 '17 at 04:05

Addison

The OP has posted his grammar and he IS skipping the whitespace. – Alexander Rossa Sep 05 '17 at 00:05
