
I have a problem with my ANTLR grammar (or lexer). I need to parse a string of custom text and find the functions in it. A function has the format $foo($bar(3),'strArg'). I found a solution in the post "ANTLR Nested Functions" and adapted it slightly for my needs. While testing different cases, however, I found one that breaks the parser: $foo($3,'strArg'). This throws an IncorectSyntax exception. I tried many variants (for example, not skipping $ and including it in the parse tree), but all of these attempts were unsuccessful.

Lexer

lexer grammar TLexer;

TEXT
 : ~[$]
 ;

FUNCTION_START
 : '$' -> pushMode(IN_FUNCTION), skip
 ;

mode IN_FUNCTION;
  FUNCTION_NESTED : '$' -> pushMode(IN_FUNCTION), skip;
  ID              : [a-zA-Z_]+;
  PAR_OPEN        : '(';
  PAR_CLOSE       : ')' -> popMode;
  NUMBER          : [0-9]+;
  STRING          : '\'' ( ~'\'' | '\'\'' )* '\'';
  COMMA           : ',';
  SPACE           : [ \t\r\n] -> skip;

Parser

parser grammar TParser;


options {
  tokenVocab=TLexer;
}

parse
 : atom* EOF
 ;

atom
 : text
 | function
 ;

text
 : TEXT+
 ;

function
 : ID params
 ;

params
 : PAR_OPEN ( param ( COMMA param )* )? PAR_CLOSE
 ;

param
 : NUMBER
 | STRING
 | function
 ;
  • `3` is not a correct function name (which is defined as `ID`, i.e. `[a-zA-Z_]+`). – Piotr P. Karwasz Nov 28 '19 at 21:57
  • I'm sorry for the misinformation. I meant that `$foo($3)` is the incorrect case, but the parser doesn't fail at `$3` (to the parser it looks like a `NUMBER` token); it fails if you add some more text: `$foo($3) and some more text`. As I understand it, this happens because `$` is skipped and you are still in `IN_FUNCTION` mode – Alexey Kryuchkov Nov 28 '19 at 22:06

1 Answer


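The parser does not fail on $foo($3,'strArg') because, when the lexer reaches the second $, it is already in IN_FUNCTION mode and the parser is expecting a parameter. The lexer skips the $ and reads 3 as a NUMBER, which is a valid param. The skipped $ still pushes IN_FUNCTION a second time, though, and the single ) pops only one level, so the lexer is left in IN_FUNCTION mode after the call. That is why any text following the call causes a syntax error.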
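The effect on the mode stack can be sketched with a small Python model. This is only a hand-written approximation of the question's lexer, not ANTLR itself: the function name `trace_modes` is made up for illustration, and the model ignores the fact that a `$` or `)` inside a STRING would not trigger a mode change.

```python
def trace_modes(text):
    """Return the mode the question's lexer would end in (simplified model)."""
    stack = []              # modes saved by pushMode
    mode = "DEFAULT_MODE"
    for ch in text:
        if ch == "$":
            # Both '$' rules run pushMode(IN_FUNCTION), even though the
            # token itself is skipped.
            stack.append(mode)
            mode = "IN_FUNCTION"
        elif ch == ")" and mode == "IN_FUNCTION":
            # PAR_CLOSE -> popMode
            mode = stack.pop()
    return mode


print(trace_modes("$foo($bar(3),'strArg')"))  # pushes and pops balance
print(trace_modes("$foo($3,'strArg')"))       # one push is never popped
```

In this model the first call ends in `DEFAULT_MODE`, while the second ends in `IN_FUNCTION`, so any trailing text would be tokenized with the IN_FUNCTION rules instead of TEXT.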

If you want it to fail you need to unskip the dollar signs in the Lexer:

FUNCTION_START : '$' -> pushMode(IN_FUNCTION);

mode IN_FUNCTION;
    // emit nested '$' with the same token type that the function rule expects
    FUNCTION_NESTED : '$' -> type(FUNCTION_START), pushMode(IN_FUNCTION);

and modify the function rule:

function : FUNCTION_START ID params;
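
To see why keeping the `$` makes `$foo($3,'strArg')` fail, here is a rough Python stand-in for the modified grammar. It is a sketch under assumptions, not generated ANTLR code: `tokenize` and `accepts_function` are hypothetical names, the input is taken to be a single function call with no surrounding text, and a nested `$` is assumed to produce the same FUNCTION_START token type as the outer one.

```python
import re

# Token patterns modelled on the IN_FUNCTION mode, with '$' kept as a token.
TOKEN_RE = re.compile(r"""
    (?P<FUNCTION_START>\$)
  | (?P<ID>[a-zA-Z_]+)
  | (?P<PAR_OPEN>\()
  | (?P<PAR_CLOSE>\))
  | (?P<NUMBER>[0-9]+)
  | (?P<STRING>'(?:[^']|'')*')
  | (?P<COMMA>,)
  | (?P<SPACE>[ \t\r\n])
""", re.VERBOSE)


def tokenize(text):
    """Split text into token type names, dropping SPACE like '-> skip'."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r}")
        if m.lastgroup != "SPACE":
            tokens.append(m.lastgroup)
        pos = m.end()
    return tokens


def accepts_function(text):
    """Check text against: function : FUNCTION_START ID params ;"""
    try:
        tokens = tokenize(text) + ["EOF"]
    except SyntaxError:
        return False
    pos = 0

    def expect(kind):
        nonlocal pos
        if tokens[pos] != kind:
            raise SyntaxError(f"expected {kind}, got {tokens[pos]}")
        pos += 1

    def function():
        expect("FUNCTION_START")
        expect("ID")
        params()

    def params():
        # PAR_OPEN ( param ( COMMA param )* )? PAR_CLOSE
        expect("PAR_OPEN")
        if tokens[pos] != "PAR_CLOSE":
            param()
            while tokens[pos] == "COMMA":
                expect("COMMA")
                param()
        expect("PAR_CLOSE")

    def param():
        # NUMBER | STRING | function
        nonlocal pos
        if tokens[pos] in ("NUMBER", "STRING"):
            pos += 1
        else:
            function()

    try:
        function()
        expect("EOF")
        return True
    except SyntaxError:
        return False
```

With the `$` present in the token stream, `accepts_function("$foo($bar(3),'strArg')")` succeeds, whereas `accepts_function("$foo($3,'strArg')")` is rejected because a FUNCTION_START must be followed by an ID and `3` is a NUMBER.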
Piotr P. Karwasz