
Is it possible to get the "active" ANTLR rule from which an action method was called?

Something like this log function, written in ANTLR pseudo-code, which should show the start and end positions of some rules without having to hand over the $start and $end tokens with every log() call:

@members{
  private void log() {
    System.out.println("Start: " + $activeRule.start.pos +
                       "End: " + $activeRule.stop.pos);
  }
}

expr: multExpr (('+'|'-') multExpr)* {log(); }
    ;

multExpr
    : atom('*' atom)* {log(); }
    ;

atom: INT
    | ID {log(); }
    | '(' expr ')'
    ;
Sonson

2 Answers


(for Antlr4)

I was searching for how to get the name of the active rule and found this post. After some more research, I found a way to do it:

    prog:   statement[this.getRuleNames() /* parser rule names */]* EOF
        ;
    
    statement [String[] rule_names]
        locals [String rule_name]
        @after { System.out.println("The statement is a " + $rule_name + " : `" + $text + "`"); }
        :   stmt_a[rule_names] {$rule_name = $stmt_a.rule_name;}
        ;   
    stmt_a [String[] rule_names] returns [String rule_name]
        :   'stmt_a' { $rule_name = rule_names[$ctx.getRuleIndex()]; }
        ;

A more general solution passes the context on to the surrounding rule, from which you can extract all the information about the last active rule.

File RuleName.g4:

grammar RuleName;

prog
    @init {System.out.println("Last update 1026");}
    :   statement[this.getRuleNames() /* parser rule names */]* EOF
    ;

statement [String[] rule_names]
    locals [String rule_name, ParserRuleContext context]
    @after { $rule_name = rule_names[$context.getRuleIndex()];
             System.out.println("The statement is a " + $rule_name + " : `" + $text + "`" + " from " + $start + " to " + $stop); }
    :   stmt_a {$context = (ParserRuleContext)$stmt_a.context;}
    |   stmt_b {$context = (ParserRuleContext)$stmt_b.context;}
    |   stmt_c {$context = (ParserRuleContext)$stmt_c.context;}
    ;

stmt_a returns [Stmt_aContext context]
    :   'stmt_a' more { $context = $ctx; }
    ;

stmt_b returns [Stmt_bContext context]
    :   'stmt_b' more { $context = $ctx; }
    ;

stmt_c returns [Stmt_cContext context]
    :   'stmt_c' more { $context = $ctx; }
    ;
 
more
    :   ID+
    ;

ID : [A-Z] ;
WS : [ \t]+ -> channel(HIDDEN) ;
NL : [\r\n]+ -> skip ;

File input.txt:

stmt_c X Y Z
stmt_a A B C
stmt_b D E F

Execution:

$ export CLASSPATH=".:/usr/local/lib/antlr-4.9-complete.jar"
$ alias a4='java -jar /usr/local/lib/antlr-4.9-complete.jar'
$ alias grun='java org.antlr.v4.gui.TestRig'
$ a4 -no-listener RuleName.g4 
$ javac RuleName*.java
$ grun RuleName prog -tokens input.txt 
[@0,0:5='stmt_c',<'stmt_c'>,1:0]
[@1,6:6=' ',<WS>,channel=1,1:6]
[@2,7:7='X',<ID>,1:7]
[@3,8:8=' ',<WS>,channel=1,1:8]
[@4,9:9='Y',<ID>,1:9]
[@5,10:10=' ',<WS>,channel=1,1:10]
[@6,11:11='Z',<ID>,1:11]
...
[@21,39:38='<EOF>',<EOF>,4:0]
Last update 1026
The statement is a stmt_c : `stmt_c X Y Z` from [@0,0:5='stmt_c',<3>,1:0] to [@6,11:11='Z',<4>,1:11]
The statement is a stmt_a : `stmt_a A B C` from [@7,13:18='stmt_a',<1>,2:0] to [@13,24:24='C',<4>,2:11]
The statement is a stmt_b : `stmt_b D E F` from [@14,26:31='stmt_b',<2>,3:0] to [@20,37:37='F',<4>,3:11]
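
If the goal is a parameterless log() as in the question, the following is a minimal sketch for ANTLR 4 only. It assumes that Parser.getContext() returns the context of the rule currently being parsed and that the stop token is already set when the @after action runs (the output above suggests it is). The grammar name LogDemo and the helper log() are hypothetical and not part of the grammar above.

```antlr
grammar LogDemo;

@parser::members {
  // Hypothetical helper: looks up the current rule context itself,
  // so callers do not have to pass $start and $stop.
  private void log() {
    ParserRuleContext ctx = getContext();                  // context of the active rule
    String ruleName = getRuleNames()[ctx.getRuleIndex()];  // map rule index to rule name
    System.out.println(ruleName + ": start=" + ctx.getStart()
                                + ", stop=" + ctx.getStop());
  }
}

expr
    @after { log(); }
    :   multExpr (('+'|'-') multExpr)*
    ;

multExpr
    @after { log(); }
    :   atom ('*' atom)*
    ;

atom
    @after { log(); }
    :   INT
    |   ID
    |   '(' expr ')'
    ;

INT : [0-9]+ ;
ID  : [a-zA-Z]+ ;
WS  : [ \t\r\n]+ -> skip ;
```

Calling log() from an inline action in the middle of a rule would still report the right rule name, but the stop token may not be set yet at that point, which is why the sketch uses @after.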
BernardK

No, there is no way to get the name of the rule the parser is currently in. Realize that parser rules are, by default, simply Java methods returning void, and ANTLR gives such a method no way to ask, at run-time, which rule it belongs to.

If you set output=AST in the options { ... } of your grammar, every parser rule creates (and returns) an instance of a ParserRuleReturnScope called retval, so you could use that for your purposes:

// ...

options {
  output=AST;
}

// ...

@parser::members{
  private void log(ParserRuleReturnScope rule) {
    System.out.println("Rule: "    + rule.getClass().getName() +  
                       ", start: " + rule.start +
                       ", end: "   + rule.stop);
  }
}

expr: multExpr (('+'|'-') multExpr)*    {log(retval);}
    ;

multExpr
    : atom('*' atom)*                   {log(retval);}
    ;

atom: INT
    | ID                                {log(retval);}
    | '(' expr ')'
    ;
// ...

This is however not a very reliable thing to do: the name of the variable may very well change in the next version of ANTLR.

Bart Kiers
  • Thank you for your helpful answer! I will pass over the start and end-tokens to the log()-function in an @after-action. – Sonson Aug 31 '11 at 07:14
  • @Sonson, you're welcome. Yes, that is a better option (passing `getStart()` and `getStop()` to a log-method in the `@after { ... }` block of the rule): less fragile than using the local variable `retval`. – Bart Kiers Sep 01 '11 at 11:45
  • I did the same - see http://stackoverflow.com/questions/12892122/how-to-merge-two-asts, but `stop` and `getTree()` always return null :-( – j3d Oct 18 '12 at 10:33
  • Things have changed since 2011. I have added an answer. – BernardK Jan 10 '21 at 20:25
  • @BernardK for sure things have changed, but not for ANTLR v3, AFAIK (which this question is about). – Bart Kiers Jan 10 '21 at 21:36
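
The @after approach mentioned in the comments above could look roughly like this for ANTLR v3. This is a sketch under assumptions: the grammar name LogDemo, the three-argument log() helper, and the lexer rules are hypothetical, and the rule name is passed as a string literal because v3 does not expose it at run-time. It relies on $start and $stop being available as predefined rule attributes inside the rule's own @after action.

```antlr
grammar LogDemo;

@parser::members {
  // Hypothetical helper: receives the rule name and boundary tokens explicitly,
  // so it does not depend on the generated local variable retval.
  private void log(String rule, Token start, Token stop) {
    System.out.println("Rule: "    + rule +
                       ", start: " + start +
                       ", end: "   + stop);
  }
}

expr
@after { log("expr", $start, $stop); }
    : multExpr (('+'|'-') multExpr)*
    ;

multExpr
@after { log("multExpr", $start, $stop); }
    : atom ('*' atom)*
    ;

atom
@after { log("atom", $start, $stop); }
    : INT
    | ID
    | '(' expr ')'
    ;

INT : '0'..'9'+ ;
ID  : ('a'..'z'|'A'..'Z')+ ;
WS  : (' '|'\t'|'\r'|'\n')+ {skip();} ;
```

Using @after rather than an inline action matters here: the stop token is only assigned once the rule has finished matching, so an action placed earlier in the rule may see it as null.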