I want to be able to parse int [] or int tokens.

Consider the following grammar:

TYPE    :   'int' AFTERINT;
AFTERINT:   '[' ']';

Of course it works, but only for int []. To make it work for int too, I changed AFTERINT to this (added an empty alternative):

AFTERINT:   '[' ']' |
              |;

But now I get this warning and error:

[13:34:08] warning(200): MiniJava.g:5:9: Decision can match input such as "" using multiple alternatives: 2, 3
As a result, alternative(s) 3 were disabled for that input
[13:34:08] error(201): MiniJava.g:5:9: The following alternatives can never be matched: 3

Why won't the empty alternative work?

c0dehunter

1 Answer

The lexer cannot cope with tokens that match the empty string. If you think about it for a moment, this is not surprising: after all, there are infinitely many empty strings in your input. The lexer could always produce an empty token without consuming any characters, resulting in an infinite loop.
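
To make the infinite-loop argument concrete, here is a minimal sketch in Python (not ANTLR itself) of a naive first-match lexer. Everything in it is hypothetical and only for illustration: the rule table, the function name tokenize, and the max_tokens guard. A lexer generated by ANTLR is more involved, but the failure mode is the same: a rule that can match nothing never advances the input position.

```python
import re

# Hypothetical rule table: AFTERINT may match the empty string,
# mirroring the empty alternative from the question.
RULES = [
    ("INT", re.compile(r"int")),
    ("AFTERINT", re.compile(r"(\[\])?")),
]

def tokenize(text, max_tokens=10):
    """Naive lexer: at each position, emit the first rule that matches."""
    pos, tokens = 0, []
    while pos < len(text):
        if len(tokens) >= max_tokens:
            # Guard added for this sketch; without it the loop never ends.
            raise RuntimeError("lexer is not making progress")
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            if m:
                tokens.append((name, m.group()))
                pos = m.end()  # an empty match leaves pos unchanged
                break
        else:
            raise ValueError(f"unexpected character at {pos}")
    return tokens
```

With this sketch, tokenize("int[]") finishes normally, while tokenize("int x") keeps emitting empty AFTERINT tokens at the same position until the guard raises RuntimeError.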

The recognition of types does not belong in the lexer, but in the parser:

type
 : (INT | DOUBLE | BOOLEAN | ID) (OBR CBR)?
 ;

OBR     : '[';
CBR     : ']';
INT     : 'int';
DOUBLE  : 'double';
BOOLEAN : 'boolean';
ID      : ('a'..'z' | 'A'..'Z')+;

Whenever you start combining different types of characters to create a (single) token, it's usually better to create a parser rule for this. Think of lexer rules (tokens) as the smallest building blocks of your language. From these building blocks, you compose parser rules.
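
The same split between small tokens and a composing parser rule can be sketched by hand in Python. This is only an illustration under stated assumptions, not what ANTLR generates: the names lex and parse_type are hypothetical, the WS pattern for skipping whitespace is an addition that the grammar above does not show, and the \b word boundaries are one way to keep a name such as "integer" from being split into a keyword plus an identifier.

```python
import re

# Token patterns mirror the lexer rules above; keywords come before ID
# so that "int" is not lexed as an identifier. WS is an assumption.
TOKEN_RE = re.compile(
    r"(?P<OBR>\[)|(?P<CBR>\])"
    r"|(?P<INT>int\b)|(?P<DOUBLE>double\b)|(?P<BOOLEAN>boolean\b)"
    r"|(?P<ID>[a-zA-Z]+)|(?P<WS>\s+)"
)

def lex(text):
    """Split text into (kind, value) pairs, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()  # every pattern consumes at least one character
    return tokens

def parse_type(tokens):
    """type : (INT | DOUBLE | BOOLEAN | ID) (OBR CBR)? ;"""
    kinds = [kind for kind, _ in tokens]
    if not kinds or kinds[0] not in {"INT", "DOUBLE", "BOOLEAN", "ID"}:
        raise SyntaxError("expected a type name")
    if kinds[1:] == []:
        return (tokens[0][1], False)  # plain type, e.g. int
    if kinds[1:] == ["OBR", "CBR"]:
        return (tokens[0][1], True)   # array type, e.g. int []
    raise SyntaxError("expected '[' ']' or end of input")
```

Here parse_type(lex("int")) and parse_type(lex("int []")) both succeed, and the optional brackets are handled in the parser function rather than by a token that may be empty.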

Bart Kiers
  • Thank you Bart for the second time this week :) I was able to resolve the problem now and hope it will also help future ANTLR-help-seekers! – c0dehunter Nov 08 '12 at 15:17