
Consider two ANTLR grammars in a directory structure like the following:

antlr-4.12.0-complete.jar
generated-code/
grammars/
  com/
    example/
      version1/
        Grammar.g4
      version2/
        Grammar.g4

The idea is to keep around a parser for each version of a grammar for backwards compatibility. The grammars have the same name, but are not identical and are in different packages:

Grammar.g4 (version 1)
----------------------
grammar Grammar;

@header {
package com.example.version1;
}

start: 'abc'.*?'def';

Grammar.g4 (version 2)
----------------------
grammar Grammar;

@header {
package com.example.version2;
}

start: 'qrs'.*?'tuv';

A command like

grammars/> java -jar ../antlr-4.12.0-complete.jar -o ../generated-code ./com/example/version1/Grammar.g4 ./com/example/version2/Grammar.g4

results in the following:

generated-code/
  com/
    example/
      version1/
        GrammarLexer.java
        GrammarParser.java
        ...

No version2 directory is generated. The cause appears to be in ANTLR's Tool class, in

public List<GrammarRootAST> sortGrammarByTokenVocab(List<String> fileNames)

where grammars are collected by name only. The list of filenames contains two grammars, but the return value contains only one AST.
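To illustrate the kind of collapse described above, here is a minimal sketch. It is not ANTLR's code: `NameCollisionSketch`, `GrammarFile`, `collectByName` and `collectByPath` are hypothetical names invented for this example. It only shows that a collection keyed by grammar name can hold one entry for two files that both declare `grammar Grammar;`, while a collection keyed by file path keeps both. Keeping the first file is an assumption chosen to match the output shown above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NameCollisionSketch {

    /** Hypothetical stand-in for a parsed grammar: declared name plus source file. */
    static final class GrammarFile {
        final String name;
        final String path;

        GrammarFile(String name, String path) {
            this.name = name;
            this.path = path;
        }
    }

    /** Keys by grammar name only; a second file with the same name is dropped. */
    static Map<String, GrammarFile> collectByName(List<GrammarFile> files) {
        Map<String, GrammarFile> byName = new LinkedHashMap<>();
        for (GrammarFile f : files) {
            // Assumption: the first file seen for a name is the one kept.
            byName.putIfAbsent(f.name, f);
        }
        return byName;
    }

    /** Keys by file path; same-named grammars in different directories both survive. */
    static Map<String, GrammarFile> collectByPath(List<GrammarFile> files) {
        Map<String, GrammarFile> byPath = new LinkedHashMap<>();
        for (GrammarFile f : files) {
            byPath.putIfAbsent(f.path, f);
        }
        return byPath;
    }

    public static void main(String[] args) {
        List<GrammarFile> files = List.of(
                new GrammarFile("Grammar", "com/example/version1/Grammar.g4"),
                new GrammarFile("Grammar", "com/example/version2/Grammar.g4"));

        System.out.println("keyed by name: " + collectByName(files).size());
        System.out.println("keyed by path: " + collectByPath(files).size());
    }
}
```

With the two files from the directory listing, the name-keyed map ends up with a single entry and the path-keyed map with two.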

There are several easy workarounds: (1) give the grammars different names, (2) run the tool twice, or (3) make one grammar that can handle all the versions. I can do (1), which makes this question rather low-priority. I can't easily do (2) because I'm using the ANTLR Gradle plugin, which operates on all grammars at once; because the file ordering is not guaranteed, I randomly get generated code for version 1 or version 2, but never both. I can't do (3) because these ANTLR grammars are not hand-written, but generated from a multi-thousand-line proprietary grammar with a non-ANTLR syntax, which has changed significantly over the years.

Should this be considered an ANTLR bug, since the grammars are different and in different packages? Or is it user error to supply the tool with multiple unrelated grammars with the same name in a single invocation (in which case the Gradle plugin is also making that error, though reasonably so)?

  • Outside of Gradle, the tool works fine for this scenario: just make two calls to the tool with different `-o` and `-lib` options. (It would be best if your code generator did not hardwire the package name via `@header`; it should use the `-package` option on the tool call instead.) One call to the tool cannot possibly work because the .tokens files collide, and duplicate options to the tool have "last one wins" semantics. The problem is likely that this is not a scenario the plugin can handle. – kaby76 Apr 08 '23 at 23:50
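
A rough sketch of the one-call-per-grammar approach from the comment above, outside of Gradle. It only builds the argument lists and does not run the tool. `PerGrammarInvocations`, `packageFor` and `commandFor` are hypothetical helpers, and deriving the package from the grammar's directory is an assumption. It also assumes the `package` line has been removed from `@header`, so that `-package` is the only source of the package name. The jar name and paths are those from the question; how `-o` combines with a relative grammar path should be checked against the tool's documentation.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class PerGrammarInvocations {

    /** Derives a package name from the grammar's directory, e.g. com/example/version1 -> com.example.version1. */
    static String packageFor(String grammarPath) {
        Path parent = Paths.get(grammarPath).normalize().getParent();
        List<String> parts = new ArrayList<>();
        if (parent != null) {
            for (Path element : parent) {
                parts.add(element.toString());
            }
        }
        return String.join(".", parts);
    }

    /** Builds the argument list for a single tool call covering exactly one grammar. */
    static List<String> commandFor(String antlrJar, String outputDir, String grammarPath) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(antlrJar);
        cmd.add("-o");
        cmd.add(outputDir);
        cmd.add("-package");
        cmd.add(packageFor(grammarPath));
        cmd.add(grammarPath);
        return cmd;
    }

    public static void main(String[] args) {
        String jar = "../antlr-4.12.0-complete.jar";
        String out = "../generated-code";
        String[] grammars = {
            "com/example/version1/Grammar.g4",
            "com/example/version2/Grammar.g4"
        };
        // One command per grammar, so the two same-named grammars never share an invocation.
        for (String grammar : grammars) {
            System.out.println(String.join(" ", commandFor(jar, out, grammar)));
        }
    }
}
```

Each resulting list could be handed to `ProcessBuilder` with `grammars/` as the working directory.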
