
I would like to use ANTLR4 to analyze COBOL files from a Python3 program. To do so, I need to know the position at which a token (let's say a MOVE statement) occurs in the file: at least the line and, if possible, also the character position. I need this mainly because I want to resolve COPY statements (similar to #include <> in C) and make sure I know which part of the source I am in while parsing the file.

I already searched here and found similar questions, but their answers don't seem to work for the Python implementation (anymore). One example is a previous question on this topic, which was probably about the Java runtime.

If I try that solution, I get an error as soon as I call getStartIndex() or getSymbol() (AttributeError: 'StartRuleContext' object has no attribute 'getStartIndex').

jottbe
  • The principle is still the same. Look at the interface/source of your Python ANTLR4 runtime to see what the exact names are. – Mike Lischke Mar 04 '18 at 09:15
  • Runtime APIs are, as Mike mentions, largely the same, but never 100%. Have a look at the Python3 runtime: https://github.com/antlr/antlr4/tree/master/runtime/Python3/src/antlr4 – Bart Kiers Mar 04 '18 at 09:44

2 Answers


It seems that only objects of type TerminalNodeImpl contain the required info. I came up with the following code, although I am not happy that I have to use isinstance() to check whether I have the right node type. If somebody has a cleaner way to get at the info, please let me know.

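from antlr4.tree.Tree import TerminalNode
# Cobol85PreprocessorListener is the listener class ANTLR generated from the grammar.

class Cobol85PreprocessorPrintListener(Cobol85PreprocessorListener):
    def enterEveryRule(self, ctx):
        print("<< enterEveryRule type ", type(ctx))
        # Follow the first child downwards until a terminal node is reached.
        terminal = ctx
        depth = 0
        while terminal is not None and not isinstance(terminal, TerminalNode):
            # A rule that matched nothing has no children; stop instead of failing.
            terminal = terminal.getChild(0) if terminal.getChildCount() > 0 else None
            depth += 1
        if terminal is not None:
            symbol = terminal.getSymbol()
            print('\tThe info was found in depth %d here:' % depth)
            self.printSymbolDetails(symbol, '\t\t')

    def printSymbolDetails(self, symbol, indent='\t'):
        # start/stop are character offsets into the input; line is 1-based, column 0-based.
        print(indent + 'symbol=', symbol)
        print(indent + 'text=  ', symbol.text)
        print(indent + 'start= ', symbol.start)
        print(indent + 'stop=  ', symbol.stop)
        print(indent + 'line=  ', symbol.line)
        print(indent + 'column=', symbol.column)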
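
A possibly cleaner alternative to walking down to the first terminal: as far as I can tell from the Python3 runtime source, every rule context keeps its first and last token in the plain attributes `ctx.start` and `ctx.stop` (the counterparts of Java's `getStart()`/`getStop()`). The sketch below reads the position from there. `rule_position` is a hypothetical helper name, and the `SimpleNamespace` objects are only stand-ins so the snippet runs without a generated parser.

```python
from types import SimpleNamespace


def rule_position(ctx):
    """Return (line, column) of the first token of a rule context, or None."""
    # Assumption: ctx.start holds the first token matched by the rule.
    token = getattr(ctx, "start", None)
    if token is None:
        return None
    # ANTLR counts lines from 1 and columns from 0.
    return (token.line, token.column)


# Stand-ins for a real context and token (hypothetical values).
fake_token = SimpleNamespace(line=12, column=11, text="MOVE")
fake_ctx = SimpleNamespace(start=fake_token, stop=fake_token)
print(rule_position(fake_ctx))
```

Inside a listener this would be called as `rule_position(ctx)` from `enterEveryRule` or from any specific `enter...` method, with no isinstance() check needed.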
jottbe

A simpler and more correct way is to use the visitTerminal method. The ANTLR listener calls this method whenever a terminal node is visited. You can then obtain the required info from the token.

    # Methods of your listener subclass.
    # Requires: from antlr4.tree.Tree import TerminalNode

    def visitTerminal(self, node: TerminalNode):
        # getSymbol() returns the token wrapped by the terminal node.
        terminal = node.getSymbol()
        self.print_symbol_detail(terminal)

    def print_symbol_detail(self, terminal, indent='\t'):
        print(indent + 'symbol=', terminal)
        print(indent + 'text=  ', terminal.text)
        print(indent + 'type=  ', terminal.type)
        print(indent + 'start= ', terminal.start)
        print(indent + 'stop=  ', terminal.stop)
        print(indent + 'line=  ', terminal.line)
        print(indent + 'column=', terminal.column)
        print('-' * 75)
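
If you want to look positions up later (for example to find every MOVE) rather than print them, the same hook can collect them. This is only a minimal sketch: `PositionCollectorMixin`, `find` and the fake nodes are hypothetical names made up for illustration, and the only runtime features assumed are the ones used above (`visitTerminal`, `getSymbol()` and the token attributes `text`, `line`, `column`, `start`, `stop`).

```python
from types import SimpleNamespace


class PositionCollectorMixin:
    """Records the position of every terminal instead of printing it."""

    def __init__(self):
        super().__init__()
        self.positions = []

    def visitTerminal(self, node):
        token = node.getSymbol()
        self.positions.append({
            "text": token.text,
            "line": token.line,      # 1-based
            "column": token.column,  # 0-based
            "start": token.start,    # character offset of first char
            "stop": token.stop,      # character offset of last char
        })

    def find(self, text):
        """Return all recorded positions whose text matches, ignoring case."""
        wanted = text.upper()
        return [p for p in self.positions if p["text"].upper() == wanted]


# Real use would be: class Collector(PositionCollectorMixin, Cobol85PreprocessorListener)
class Collector(PositionCollectorMixin):
    pass


def fake_node(text, line, column, start):
    # Stand-in for a terminal node so the sketch runs without a parser.
    token = SimpleNamespace(text=text, line=line, column=column,
                            start=start, stop=start + len(text) - 1)
    return SimpleNamespace(getSymbol=lambda: token)


collector = Collector()
for node in (fake_node("MOVE", 12, 11, 340),
             fake_node("A", 12, 16, 345),
             fake_node("move", 20, 11, 600)):
    collector.visitTerminal(node)
print(collector.find("MOVE"))
```

With a real parse tree, the collector would be passed to the tree walker in place of the print listener and queried once the walk has finished.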
Morteza Zakeri