16

How can I access alternate labels in ANTLR4 while generically traversing a parse tree? Alternatively, is there a way to replicate the functionality of ANTLR3's ^ operator? That would do the trick.

I'm trying to write an AST pretty printer for any ANTLR4 grammar adhering to a simple methodology (like naming productions with alternate labels). I'd like to be able to pretty print a term like 3 + 5 as (int_expression (plus (int_literal 3) (int_literal 5))), or something similar, given a grammar like the following:

int_expression 
    : int_expression '+' int_expression # plus
    | int_expression '-' int_expression # minus
    | raw_int                           # int_literal
    ;
raw_int
    : Int
    ;
Int : [0-9]+ ;

I am unable to effectively give names to the plus and minus productions, because pulling them out into their own production causes the tool to complain that the rules are mutually left-recursive. If I can't pull them out, how can I give these productions names?

Note 1: I was able to get rid of the + argument methodologically by putting "good" terminals (e.g., the Int above) in special productions (productions starting with a special prefix, like raw_). Then I could print only those terminals whose parent productions are named "raw_..." and elide all others. This worked great for getting rid of +, while keeping 3 and 5 in the output. This could be done with a ! in ANTLR3.

Note 2: I understand that I could write a specialized pretty printer or use actions for each production of a given language, but I'd like to use ANTLR4 to parse and generate ASTs for a variety of languages, and it seems like I should be able to write such a simple pretty printer generically. Said another way, I only care about getting ASTs, and I'd rather not have to encumber each grammar with a tailored pretty printer just to get an AST. Perhaps I should just go back to ANTLR3?

Chucky Ellison

2 Answers

0

I suggest implementing the pretty printer as a listener implementation with a nested visitor class to get the names of the various context objects.

private MyParser parser; // you'll have to assign this field
private StringBuilder builder = new StringBuilder();

@Override
public void enterEveryRule(@NotNull ParserRuleContext ctx) {
    if (builder.length() > 0) { // StringBuilder.isEmpty() only exists on Java 15+
        builder.append(' ');
    }

    builder.append('(').append(getContextName(ctx));
}

@Override
public void visitTerminal(@NotNull TerminalNode node) {
    // TODO: print node text to builder
}

@Override
public void visitErrorNode(@NotNull ErrorNode node) {
    // TODO: print node text to builder
}

@Override
public void exitEveryRule(@NotNull ParserRuleContext ctx) {
    builder.append(')');
}

protected String getContextName(@NotNull ParserRuleContext ctx) {
    return new ContextNameVisitor().visit(ctx);
}

protected class ContextNameVisitor extends MyParserBaseVisitor<String> {
    @Override
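    public String visitChildren(@NotNull RuleNode node) {
        // Fallback for contexts without an alternate label: use the rule name
        return parser.getRuleNames()[node.getRuleContext().getRuleIndex()];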
    }

    @Override
    public String visitPlus(@NotNull PlusContext ctx) {
        return "plus";
    }

    @Override
    public String visitMinus(@NotNull MinusContext ctx) {
        return "minus";
    }

    @Override
    public String visitInt_literal(@NotNull Int_literalContext ctx) {
        return "int_literal";
    }
}
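
To make the enter/exit shape above concrete without depending on the ANTLR runtime, here is a minimal sketch that walks a hand-built tree and produces the S-expression output. The `Node` class, `SexprPrinterDemo`, and `sample()` are hypothetical stand-ins, not ANTLR types, and the rule that only tokens under `raw_`-prefixed rules are printed is an assumption borrowed from Note 1 of the question.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SexprPrinterDemo {
    /** Hypothetical stand-in for a parse-tree node; not an ANTLR class. */
    static final class Node {
        final String text;          // rule/label name, or token text for a terminal
        final boolean terminal;
        final List<Node> children;

        private Node(String text, boolean terminal, List<Node> children) {
            this.text = text;
            this.terminal = terminal;
            this.children = children;
        }

        static Node rule(String name, Node... children) {
            return new Node(name, false, Arrays.asList(children));
        }

        static Node token(String text) {
            return new Node(text, true, Collections.<Node>emptyList());
        }
    }

    static String print(Node root) {
        StringBuilder out = new StringBuilder();
        walk(root, null, out);
        return out.toString();
    }

    private static void walk(Node node, Node parent, StringBuilder out) {
        if (node.terminal) {
            // Keep only tokens whose parent rule is named raw_...; elide the rest.
            if (parent != null && isRaw(parent)) {
                separate(out);
                out.append(node.text);
            }
            return;
        }
        boolean raw = isRaw(node);
        if (!raw) {                 // plays the role of enterEveryRule
            separate(out);
            out.append('(').append(node.text);
        }
        for (Node child : node.children) {
            walk(child, node, out);
        }
        if (!raw) {                 // plays the role of exitEveryRule
            out.append(')');
        }
    }

    private static boolean isRaw(Node n) {
        return n.text.startsWith("raw_");
    }

    // Insert a space unless we are at the start or directly after '('.
    private static void separate(StringBuilder out) {
        if (out.length() > 0 && out.charAt(out.length() - 1) != '(') {
            out.append(' ');
        }
    }

    /** The tree a parser would build for "3 + 5" with the grammar in the question. */
    static Node sample() {
        return Node.rule("plus",
                Node.rule("int_literal", Node.rule("raw_int", Node.token("3"))),
                Node.token("+"),
                Node.rule("int_literal", Node.rule("raw_int", Node.token("5"))));
    }

    public static void main(String[] args) {
        System.out.println(print(sample())); // prints (plus (int_literal 3) (int_literal 5))
    }
}
```

In a real listener the names passed to `Node.rule` would come from `getContextName(ctx)`, and the `raw_` check would look at the parent context of each terminal.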
Sam Harwell
  • I'm looking for a generic way to do this without writing a specialized pretty printer for each language. Is there no way to do this? From a user's perspective, I don't understand why there isn't, since the alternate labels are right there. – Chucky Ellison Feb 09 '15 at 04:26
0

The API doesn't contain a method to access the alternate labels.

However, there is a workaround. ANTLR4 uses the alternate labels to generate Java class names (for example, the label plus becomes a nested class PlusContext), and those classes can be inspected at run time.

For example, to access alternate labels in ANTLR4 while generically traversing a parse tree (with a listener), you can use the following function:

// Return the embedded alternate label between
// "$" and "Context" from the class name
String getCtxName(ParserRuleContext ctx) {
    String str = ctx.getClass().getName();
    str = str.substring(str.indexOf("$")+1,str.lastIndexOf("Context"));
    str = str.toLowerCase();
    return str;
}

Example use:

@Override
public void exitEveryRule(ParserRuleContext ctx) {
    System.out.println(getCtxName(ctx));
}
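
One caveat with lowercasing the whole name is that a camelCase label such as addExpr would come back as addexpr. A possible variant, sketched here with stand-in classes so it runs without the ANTLR runtime, strips the `Context` suffix from the simple class name and lowercases only the first character. `LabelNameDemo`, `DemoParser`, and `labelOf` are hypothetical names; the nested classes only mimic the naming pattern of generated contexts.

```java
public class LabelNameDemo {
    // Stand-in for org.antlr.v4.runtime.ParserRuleContext (assumption: keeps the demo dependency-free).
    static class ParserRuleContext {}

    // Stand-in for a generated parser; ANTLR4 nests one XxxContext class per rule and per label.
    static class DemoParser {
        static class Int_expressionContext extends ParserRuleContext {}
        static class PlusContext extends Int_expressionContext {}
        static class Int_literalContext extends Int_expressionContext {}
        static class AddExprContext extends Int_expressionContext {}
        static class Raw_intContext extends ParserRuleContext {}
    }

    /** Derive the label (or rule name) from the runtime class name of a context. */
    static String labelOf(Object ctx) {
        String name = ctx.getClass().getSimpleName();   // e.g. "PlusContext"
        String suffix = "Context";
        if (name.endsWith(suffix)) {
            name = name.substring(0, name.length() - suffix.length());
        }
        if (name.isEmpty()) {
            return name;
        }
        // Undo only the leading capital; the rest of the label keeps its case.
        return Character.toLowerCase(name.charAt(0)) + name.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(labelOf(new DemoParser.PlusContext()));        // prints plus
        System.out.println(labelOf(new DemoParser.Int_literalContext())); // prints int_literal
        System.out.println(labelOf(new DemoParser.AddExprContext()));     // prints addExpr
        System.out.println(labelOf(new DemoParser.Raw_intContext()));     // prints raw_int
    }
}
```

Like the original function, this relies on the naming convention of the generated classes rather than on a documented API, so it may break if that convention changes.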
chris