I have an ANTLR grammar like this:

grammar HelloGrammar1;

ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
STATEMENT : 'hello' ID ';' ;
WS  :   (' '|'\t'|'\r'|'\n')* ;

I want it to parse the following text: hello qwerty ;. It doesn't work this way. If I change my string to helloqwerty;, everything is fine. I can also change the grammar to:

grammar HelloGrammar2;

ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
STATEMENT : 'hello' WS ID WS ';' ;
WS  :   (' '|'\t'|'\r'|'\n')* ;

And in this case, hello qwerty ; works fine. Is it possible to make ANTLR skip whitespace automatically? (That is, I want to make HelloGrammar1 work with hello qwerty ;.)

Update

In case it matters: I'm testing it in ANTLRWorks.

Update 2

I also tried it this way:

grammar HelloGrammar;

ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
STATEMENT : 'hello' ID ';' ;
WS  :   (' '|'\t'|'\r'|'\n') { $channel = HIDDEN; } ;

Still doesn't work.

Update 3

I'm using the "Interpreter" tab with the "STATEMENT" rule selected.

Andrey Agibalov
  • What do you want to be allowable input? – luketorjussen Sep 05 '11 at 16:54
  • Note that no lexer rule should produce a token that can (potentially) match zero characters (the empty string). The lexer would produce an infinite number of them. So `(' '|'\t'|'\r'|'\n')*` should be either `(' '|'\t'|'\r'|'\n')` or `(' '|'\t'|'\r'|'\n')+`. – Bart Kiers Sep 06 '11 at 07:13

1 Answer

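I think the issue is that STATEMENT is a lexer rule. Putting WS on the hidden channel only hides whitespace tokens from the parser; inside a lexer rule, every character still has to be matched explicitly. Try changing STATEMENT (a lexer rule) to statement (a parser rule):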

grammar HelloGrammar;

statement : 'hello' ID ';' ;
ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
WS  :   (' '|'\t'|'\r'|'\n') { $channel = HIDDEN; } ;

In ANTLRWorks this accepts:

hello qwerty;
hello   qwerty;
hello loki2302;
hello   qwerty  ;

but does not accept:

helloqwerty;
helloqwerty ;
hello;
hello qwerty
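
If it helps to see why the parser-rule version behaves like this, here is a rough Python sketch of the idea. It is only a toy model under my own assumptions, not ANTLR's actual algorithm: `tokenize` and `accepts_statement` are made-up names, whitespace tokens are simply dropped (standing in for the hidden channel), and the "parser" just compares the remaining token types against 'hello' ID ';'.

```python
import re

# Toy stand-in for the lexer/parser split. This is NOT how ANTLR is implemented;
# it only mimics "lexer produces tokens, parser never sees hidden WS tokens".
TOKEN_RE = re.compile(
    r"(?P<ID>[A-Za-z_][A-Za-z0-9_]*)"   # same shape as the ID rule
    r"|(?P<SEMI>;)"
    r"|(?P<WS>[ \t\r\n])"               # one whitespace char, never empty
)

def tokenize(text):
    """Return (type, text) pairs, dropping WS as if it were on a hidden channel."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        kind = m.lastgroup
        if kind == "ID" and m.group() == "hello":
            kind = "HELLO"  # the literal 'hello' becomes its own token type
        if kind != "WS":
            tokens.append((kind, m.group()))
        pos = m.end()
    return tokens

def accepts_statement(text):
    """Analogue of the parser rule  statement : 'hello' ID ';' ;
    (here the whole input must be consumed)."""
    try:
        kinds = [kind for kind, _ in tokenize(text)]
    except ValueError:
        return False
    return kinds == ["HELLO", "ID", "SEMI"]

for sample in ["hello qwerty;", "hello   qwerty  ;", "helloqwerty;", "hello;"]:
    print(repr(sample), accepts_statement(sample))
```

In this model helloqwerty; is rejected because the identifier pattern swallows the whole word as a single ID, so the token for 'hello' never appears. Whitespace between tokens makes no difference because it is dropped before the token types are compared.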
luketorjussen