
I'm trying to write a grammar with ANTLR that meets the following requirements.

It should parse an identifier like:

foo > bar > 67

where foo > bar is the identifier: if > is followed by a letter, it belongs to the identifier; otherwise it is a greater-than operator.

It should also parse things like

((a = 1) AND (b = 2)) OR (c = 3)

where the parentheses are required.

I'm really new to this topic and to ANTLR, and hope someone can help.

I currently have this grammar:

grammar testgrammer;

start   :   statement EOF;

statement
    :   operation  (AND operation)*;

operation
    :   '(' ID OPERATOR INT ')';

AND :   'AND';

OPERATOR:   '=' | '>';

ID  
  :  ('a'..'z'| 'A'..'Z')+ (WS '>' WS ('a'..'z' | 'A'..'Z')+)?
  ;

WS  
  :  ' '+ {skip();}
  ;

INT :   '0'..'9'+
    ;

but I can't figure out how to distinguish between the > inside an ID and the > used as an operator.

Sebastian

1 Answer

First off, it's confusing to let "foo > bar" be an identifier while "foo > 67" is an expression.

Since you allow for spaces inside such an identifier, your lexer will trip over input like "foo > 67" because after "foo > " it will try to consume a letter but sees a digit. And the lexer will not backtrack from "foo > " because there is no single token that could be created from it (note that the lexer never gives up on characters it consumed!).

To handle this, you must make sure the lexer only consumes " > " as part of an identifier when it is actually followed by some letters. You can do that by looking ahead with a syntactic predicate (the ( ... )=> part):

Id
 : IdPart ((Spaces? '>' Spaces? IdPart)=> Spaces? '>' Spaces? IdPart)*
 ;

SpaceChars
 : (Spaces | '\r' | '\n') {skip();}
 ;

fragment Digit  : '0'..'9';
fragment Letter : 'a'..'z' | 'A'..'Z';
fragment Spaces : (' ' | '\t')+;
fragment IdPart : Letter (Letter | Digit)*;

Note that you cannot use the rule SpaceChars inside Id because that rule invokes the skip() method.
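
For the second part of the question (nested, parenthesized AND/OR expressions), the lexer rules above could be combined with parser rules along the following lines. This is only a sketch and has not been run through ANTLR: it assumes ANTLR 3 (the version that supports the ( ... )=> predicate syntax), reuses the AND, OPERATOR and INT tokens from the question's grammar, and introduces OR, expression, andExpression, atom and comparison as illustrative names. AND and OR are listed before Id on the assumption that the earlier rule wins when both match the same text.

```
grammar testgrammer;

start
 : expression EOF
 ;

// OR binds weaker than AND
expression
 : andExpression (OR andExpression)*
 ;

andExpression
 : atom (AND atom)*
 ;

// every comparison and every nested group must be wrapped in parentheses
atom
 : '(' (expression | comparison) ')'
 ;

comparison
 : Id OPERATOR INT
 ;

// keywords are defined before Id so that "AND" and "OR" are not lexed as identifiers
AND      : 'AND';
OR       : 'OR';
OPERATOR : '=' | '>';
INT      : Digit+;

// '>' only becomes part of the identifier when the predicate sees an IdPart after it
Id
 : IdPart ((Spaces? '>' Spaces? IdPart)=> Spaces? '>' Spaces? IdPart)*
 ;

SpaceChars
 : (Spaces | '\r' | '\n') {skip();}
 ;

fragment Digit  : '0'..'9';
fragment Letter : 'a'..'z' | 'A'..'Z';
fragment Spaces : (' ' | '\t')+;
fragment IdPart : Letter (Letter | Digit)*;
```

With these rules, (foo > bar > 67) should be tokenized as '(' Id OPERATOR INT ')', and ((a = 1) AND (b = 2)) OR (c = 3) should parse as an OR of a nested AND group and a single comparison.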

Bart Kiers