I wrote a grammar for a language, and now I want to handle some syntactic-sugar constructs. For that I was thinking of writing a template translator.

The problem is that I want my template grammar to translate only some constructs of the language and leave the rest as is.

For example:

I have this as input:

class Main { 
   int a[10];
}

and I want to translate that into something like:

class Main { 
   Array a = new Array(10);
}

Ideally I would like to do something like this in ANTLR:

grammar Translator;
options { output=template; }

decl
    : TYPE ID '[' INT ']' -> template(name={$ID.text}, size={$INT.text})
          "Array <name> = new Array(<size>);"
    ;

I would like it to leave the rest of the input that doesn't match rule decl as it is.

How can I achieve this in ANTLR without writing the full grammar for the language?

John Retallack

1 Answer

I would simply handle such things in the parser grammar.

Assuming you're constructing an AST in your parser grammar, I guess you'll have a rule to parse input like Array a = new Array(10); similar to:

decl
  :  TYPE ID '=' expr ';' -> ^(DECL TYPE ID expr)
  ;

where expr eventually matches a term like this:

term
  :  NUMBER
  |  'new' ID '(' (expr (',' expr)*)? ')' -> ^('new' ID expr*)
  |  ...
  ;

To account for your short-hand declaration int a[10];, all you have to do is expand decl like this:

decl
  :  TYPE ID '=' expr     ';' -> ^(DECL TYPE  ID expr)
  |  TYPE ID '[' expr ']' ';' -> ^(DECL ARRAY ID ^(NEW ARRAY expr))
  ;

which will rewrite the input int a[10]; into the following AST:

[image: AST with root DECL and children Array, a, and a new subtree whose children are Array and 10]

which is exactly the same as the AST created for input Array a = new Array(10);.
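
To make the "same AST" claim concrete without involving ANTLR, here is a minimal sketch in plain Java. The `Node` class and the `longForm` and `shortHand` helpers are hypothetical names invented for this illustration. They are not part of the ANTLR runtime, and `Node` only mimics the LISP-style output of a tree's `toStringTree()`. The sketch assumes, as the rewrite rule above does, that the element type of the short-hand form is discarded.

```java
import java.util.ArrayList;
import java.util.List;

public class DesugarSketch {
    /** Minimal stand-in for an AST node (hypothetical; not ANTLR's CommonTree). */
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text, Node... kids) {
            this.text = text;
            for (Node k : kids) {
                children.add(k);
            }
        }

        /** LISP-style rendering: a leaf prints its text, an inner node prints (text child ...). */
        String toStringTree() {
            if (children.isEmpty()) {
                return text;
            }
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node c : children) {
                sb.append(' ').append(c.toStringTree());
            }
            return sb.append(')').toString();
        }
    }

    /** Tree for the long form: TYPE ID '=' 'new' TYPE '(' expr ')' ';' */
    static Node longForm(String type, String id, String ctorType, String arg) {
        return new Node("DECL", new Node(type), new Node(id),
                new Node("new", new Node(ctorType), new Node(arg)));
    }

    /** Tree for the short-hand: TYPE ID '[' expr ']' ';' (element type is ignored). */
    static Node shortHand(String elementType, String id, String size) {
        return longForm("Array", id, "Array", size);
    }

    public static void main(String[] args) {
        String a = longForm("Array", "a", "Array", "10").toStringTree();
        String b = shortHand("int", "a", "10").toStringTree();
        System.out.println(a);           // (DECL Array a (new Array 10))
        System.out.println(b);           // (DECL Array a (new Array 10))
        System.out.println(a.equals(b)); // true
    }
}
```

Because both forms produce the same tree, any later stage (a tree walker or a template emitter) only ever has to handle the long form.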

EDIT

Here's a small working demo:

grammar T;

options {
  output=AST;
}

tokens {
  ROOT;
  DECL;
  NEW='new';
  INT='int';
  ARRAY='Array';
}

parse
  :  decl+ EOF -> ^(ROOT decl+)
  ;

decl
  :  type ID '=' expr     ';' -> ^(DECL type  ID expr)
  |  type ID '[' expr ']' ';' -> ^(DECL ARRAY ID ^(NEW ARRAY expr))
  ;

expr
  :  Number
  |  NEW type '(' (expr (',' expr)*)? ')' -> ^(NEW type expr*)
  ;

type
  :  INT
  |  ARRAY
  |  ID
  ;

ID     : ('a'..'z' | 'A'..'Z')+;
Number : '0'..'9'+;
Space  : (' ' | '\t' | '\r' | '\n') {skip();};

which can be tested with the class:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String src = "Array a = new Array(10); int a[10];";
    TLexer lexer = new TLexer(new ANTLRStringStream(src));
    TParser parser = new TParser(new CommonTokenStream(lexer));
    CommonTree tree = (CommonTree)parser.parse().getTree();
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT(tree);
    System.out.println(st);
  }
}
Bart Kiers
  • I tried that, but when I try to pass a string in the rewrite rule, ANTLR issues an error: reference to undefined token in rewrite rule: 'Array', and the same for 'new' – John Retallack Nov 03 '11 at 18:40
  • Actually the syntax is TOKEN["literal"] so I need to do something like: TYPE ID '[' expr ']' ';' -> ^(DECL TYPE["Array"] ID ^(NEW TYPE["Array"] expr)) – John Retallack Nov 03 '11 at 19:00
  • @JohnRetallack, yeah, I made a mistake: I used a literal token in a rewrite rule that wasn't present in the parser rule. I fixed it, and added a working example. – Bart Kiers Nov 03 '11 at 19:13
  • @JohnRetallack, yes, well done, that `T['text']` might work too. – Bart Kiers Nov 03 '11 at 19:14