
I have an antlr4 grammar file that parses a BASIC language. Is there a way to insert more code in my extended baseListener class?

For example, if I am parsing this code:

10 print "hello world"
   %include "moreCode.bas"
   print "after include"

moreCode.bas could be something like:

for t% = 1% to 10%
   print t%
next t%

I need to detect the include command and include the contents into the file being walked and continue walking it as a whole.

So I was thinking that in my enterIncludeCommand method in my listener class I would start a new parser for moreCode.bas and then somehow insert the tokens/contexts into my current one.

What is the correct way of doing this?

tmakaro

2 Answers


There is no one right pattern. That said, one effective way is to have your main method start the parse by always going through a constructor that takes a state object and a source path as parameters:

public class BasicParser {
    public static void main(String[] args) {
        ...
        StateModel state = new StateModel();
        RecurseParser rp = new RecurseParser(state, pathname);
        ...
    }
}

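public class RecurseParser {

    private final StateModel state;  // shared by the main parse and every include
    private final String pathname;

    public RecurseParser(StateModel state, String pathname) {
        this.state = state;
        this.pathname = pathname;  // source text to parse
        ...
    }

    public StateModel getResults() {
        return this.state;
    }
}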

In your enterIncludeStatement method, create and run a new RecurseParser instance directly. In your exitIncludeStatement, retrieve the new current state and, as appropriate, validate/check for errors.

Since the state model encapsulates your symbol table, etc., you maintain continuity as you walk through the forest -- recursion is effectively free.

I should mention that, as far as the symbol table is concerned, executing an include should be treated essentially the same as calling a subroutine.
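To make the recursion concrete, here is a minimal sketch with no ANTLR dependency. It assumes a line-based scan as a stand-in for the generated lexer, parser and listener, and an in-memory map (`sources`) as a stand-in for the file system. The `trace`, `errors` and `active` fields of `StateModel` are hypothetical and only there for illustration; a real state object would carry the symbol table.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical state object; a real one would also hold the symbol table.
class StateModel {
    final List<String> trace = new ArrayList<>();     // statements in the order they were seen
    final List<String> errors = new ArrayList<>();
    final Deque<String> active = new ArrayDeque<>();  // files currently being parsed
}

class RecurseParser {
    private final StateModel state;
    private final String pathname;
    private final Map<String, String> sources;  // assumption: in-memory stand-in for files

    RecurseParser(StateModel state, String pathname, Map<String, String> sources) {
        this.state = state;
        this.pathname = pathname;
        this.sources = sources;
    }

    // Stand-in for "lex, parse, then walk the tree with a listener".
    void run() {
        String text = sources.get(pathname);
        if (text == null) {
            state.errors.add("cannot open " + pathname);
            return;
        }
        if (state.active.contains(pathname)) {
            state.errors.add("circular include of " + pathname);
            return;
        }
        state.active.push(pathname);
        int lineNo = 0;
        for (String line : text.split("\n")) {
            lineNo++;
            String stmt = line.trim();
            if (stmt.isEmpty()) {
                continue;
            }
            if (stmt.startsWith("%include")) {
                // This is the step enterIncludeStatement would perform:
                // a nested parser that shares the same state object.
                String included = stmt.substring("%include".length()).trim().replace("\"", "");
                new RecurseParser(state, included, sources).run();
            } else {
                state.trace.add(pathname + ":" + lineNo + " " + stmt);
            }
        }
        state.active.pop();
    }

    StateModel getResults() {
        return state;
    }
}

class RecursiveIncludeDemo {
    public static void main(String[] args) {
        Map<String, String> sources = new HashMap<>();
        sources.put("main.bas",
                "10 print \"hello world\"\n   %include \"moreCode.bas\"\n   print \"after include\"\n");
        sources.put("moreCode.bas", "for t% = 1% to 10%\n   print t%\nnext t%\n");

        StateModel state = new StateModel();
        new RecurseParser(state, "main.bas", sources).run();
        state.getClass();  // state now holds the combined results
        state.trace.forEach(System.out::println);
    }
}
```

Because every statement is recorded with its own file name and line number, error messages can still point into the right file, which is the main thing lost when trees or source strings are merged.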

Related: Symbol Table

GRosenberg
  • A very interesting idea. I'll have to think about how that will work for me, since I also need to make a few passes. I was thinking about having the enterIncludeStatement in my Phase1Listener (for lack of a better name) parse the include and inject it into the current tree. Then phase 2 can just walk the current tree and no longer has to worry about the includes. Your idea would work great if I also wanted to keep those includes separate. – tmakaro Oct 29 '14 at 06:14
  • Trying to merge parse trees will be problematic, if only for error reporting -- line numbers will be off. If the includes are just small snippets of code, it might make more sense to preprocess the include strings into the main source string prior to the initial lexing/parsing of your program. – GRosenberg Oct 29 '14 at 06:50
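
The preprocessing idea in the last comment could look roughly like the sketch below. It is an illustration, not code from either poster: the `IncludePreprocessor` class, its `expand` method and the in-memory `sources` map are all hypothetical names, and the `%include "file"` directive is assumed to sit on a line of its own, as in the question.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class IncludePreprocessor {
    private final Map<String, String> sources;  // assumption: in-memory stand-in for files
    private final Deque<String> active = new ArrayDeque<>();  // guards against include cycles

    IncludePreprocessor(Map<String, String> sources) {
        this.sources = sources;
    }

    // Returns the text of `pathname` with every %include line replaced,
    // recursively, by the text of the file it names.
    String expand(String pathname) {
        if (active.contains(pathname)) {
            throw new IllegalStateException("circular include of " + pathname);
        }
        String text = sources.get(pathname);
        if (text == null) {
            throw new IllegalArgumentException("cannot open " + pathname);
        }
        active.push(pathname);
        StringBuilder out = new StringBuilder();
        for (String line : text.split("\n")) {
            String stmt = line.trim();
            if (stmt.startsWith("%include")) {
                String included = stmt.substring("%include".length()).trim().replace("\"", "");
                out.append(expand(included));
            } else {
                out.append(line).append('\n');
            }
        }
        active.pop();
        return out.toString();
    }
}

class PreprocessDemo {
    public static void main(String[] args) {
        Map<String, String> sources = new HashMap<>();
        sources.put("main.bas",
                "10 print \"hello world\"\n   %include \"moreCode.bas\"\n   print \"after include\"\n");
        sources.put("moreCode.bas", "for t% = 1% to 10%\n   print t%\nnext t%\n");

        // The expanded string is what would be handed to the lexer.
        System.out.print(new IncludePreprocessor(sources).expand("main.bas"));
    }
}
```

Note that line numbers reported by a parser run on the expanded string refer to that string, not to the original files; mapping them back would need extra bookkeeping that this sketch leaves out.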

I have two solutions for this, and I went with the second one. GRosenberg's idea is a great one too.

1) Use the TokenStreamRewriter: in enterIncludeStatement, call the rewriter's insertBefore, insertAfter and/or replace methods. At the end of that listener's walk, call the rewriter's getText(), which gives you the combined string. You will have to reparse that text before the next listener pass.

2) In the enterIncludeStatement method of the listener class, get the include file name and run the lexer/parser on it. Then take each StatementContext (in my case) from the resulting tree, looping over every statement line in the include file, and inject it into the current tree using IncludeContext.addChild(myStatement). The tricky part is inserting the statements in the correct place, but you end up with a complete tree that the next listener phase can walk.

I used option 2 and it's working for me so far. However, I'm not sure that addChild is the best way, since I am really inserting siblings, not children. Given this sibling/child issue, GRosenberg's recursive idea may be the best.
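
For the sibling-versus-child concern, one possible shape is to replace the include node in its parent's child list instead of adding to the include node itself. The sketch below uses a hypothetical `Node` class, not ANTLR's context classes. In ANTLR the analogous list is, I believe, the `children` field of `ParserRuleContext`, but treat that as an assumption to check against your ANTLR version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical tree node standing in for a parse-tree context.
class Node {
    final String label;
    Node parent;
    final List<Node> children = new ArrayList<>();

    Node(String label) {
        this.label = label;
    }

    Node add(Node child) {
        child.parent = this;
        children.add(child);
        return this;
    }
}

class SiblingSplice {
    // Replaces includeNode in its parent's child list with the given nodes,
    // so they become siblings of the statements around the include.
    static void replaceWithSiblings(Node includeNode, List<Node> replacements) {
        Node parent = includeNode.parent;
        if (parent == null) {
            throw new IllegalArgumentException("include node has no parent");
        }
        int index = parent.children.indexOf(includeNode);
        parent.children.remove(index);
        for (Node r : replacements) {
            r.parent = parent;  // re-parent so upward navigation still works
        }
        parent.children.addAll(index, replacements);
        includeNode.parent = null;
    }

    public static void main(String[] args) {
        Node include = new Node("%include \"moreCode.bas\"");
        Node program = new Node("program")
                .add(new Node("print \"hello world\""))
                .add(include)
                .add(new Node("print \"after include\""));

        replaceWithSiblings(include, Arrays.asList(
                new Node("for t% = 1% to 10%"), new Node("print t%"), new Node("next t%")));

        for (Node child : program.children) {
            System.out.println(child.label);
        }
    }
}
```

Changing a child list while a walker is iterating over it can cause nodes to be skipped or visited twice, so it is probably safer to collect the include nodes during the walk and splice them afterwards.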

tmakaro