I am trying to use the ANTLR Maven plugin, version 4.7.2, to parse some Java source code on Ubuntu 22.04. I used the g4 files from here and successfully generated the parser and lexer. My main function is as follows:
public static void main(String[] args) throws IOException {
    InputStream inputStream = Files.newInputStream(Paths.get(
            "/xxx/java-antler-parser/src/main/java/Test.java"));
    Java8Lexer lexer = new Java8Lexer(CharStreams.fromStream(inputStream));
    Java8Parser parser = new Java8Parser(new CommonTokenStream(lexer));
    System.out.println(parser.expression());
}
The test file is an ordinary Java file with regular imports and basic logic. Part of it is shown below:
import org.antlr.v4.runtime.Lexer;
import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.atn.PredictionMode;
import java.io.File;
import java.lang.System;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
class Test {
// public static long lexerTime = 0;
public static boolean profile = false;
public static boolean notree = false;
public static boolean gui = false;
public static boolean printTree = false;
...
However, the only output was one error message followed by an empty list:
line 1:0 extraneous input 'import' expecting {'boolean', 'byte', 'char', 'double', 'float', 'int', 'long', 'new', 'short', 'super', 'this', 'void', IntegerLiteral, FloatingPointLiteral, BooleanLiteral, CharacterLiteral, StringLiteral, 'null', '(', '!', '~', '++', '--', '+', '-', Identifier, '@'}
[]
The "expecting" list is much shorter than the set of tokens defined in the g4 file. Meanwhile, when I search for the word "import" in the generated parser, I can see many definitions that include "import".
What I want is a parser that gives me each token's type and index, such as:
[@0,0:5='import',<IMPORT>,1:0], [@1,6:6=' ',<WHITESPACE>,1:6], [@2,7:9='org',<Identifier>,1:7]...
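To pin down the format I mean, here is a small hand-rolled sketch in plain Java. It does not use ANTLR at all: `TokenInfo`, `TokenDumpSketch`, and `scanLine` are hypothetical names I made up, and the type names `IMPORT`, `WHITESPACE`, `Identifier`, and `SYMBOL` are only placeholders for whatever the real grammar defines. It handles a single line and assumes each field means index, start:stop character offsets, text, type, and line:column of the token's first character.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a lexer token; this is NOT an ANTLR class.
final class TokenInfo {
    final int index, start, stop, line, column;
    final String text, type;

    TokenInfo(int index, int start, int stop, int line, int column,
              String text, String type) {
        this.index = index;
        this.start = start;
        this.stop = stop;
        this.line = line;
        this.column = column;
        this.text = text;
        this.type = type;
    }

    // Renders the token as [@index,start:stop='text',<TYPE>,line:column].
    @Override
    public String toString() {
        return "[@" + index + "," + start + ":" + stop + "='" + text
                + "',<" + type + ">," + line + ":" + column + "]";
    }
}

final class TokenDumpSketch {
    // Splits ONE line into whitespace, word, and single-character symbol tokens.
    // Only the keyword "import" is recognized; every other word is an Identifier.
    static List<TokenInfo> scanLine(String src, int line) {
        List<TokenInfo> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            int start = i;
            char c = src.charAt(i);
            String type;
            if (Character.isWhitespace(c)) {
                while (i < src.length() && Character.isWhitespace(src.charAt(i))) i++;
                type = "WHITESPACE";
            } else if (Character.isJavaIdentifierStart(c)) {
                while (i < src.length() && Character.isJavaIdentifierPart(src.charAt(i))) i++;
                type = src.substring(start, i).equals("import") ? "IMPORT" : "Identifier";
            } else {
                i++;
                type = "SYMBOL";
            }
            // On a single line the column equals the start offset.
            out.add(new TokenInfo(out.size(), start, i - 1, line, start,
                    src.substring(start, i), type));
        }
        return out;
    }

    public static void main(String[] args) {
        for (TokenInfo t : scanLine("import org.antlr.v4.runtime.Lexer;", 1)) {
            System.out.println(t);
        }
    }
}
```

Running `main` on the first line of my test file prints one such record per token, starting with the `import` keyword, then the space, then `org`. That per-token listing is what I would like to get from the generated lexer and parser instead of the error above.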
What should I do?