
I have an Antlr3-generated Java lexer/parser for this MySQL grammar.

Eclipse Neon compiles the above lexer/parser just fine, and the resulting program runs just fine too.

However, the Oracle Java 8 compiler gives a 'code too large' error for the same Java code:

[javac] .../MySQLParser.java:27: error: code too large
[javac]     public static final String[] tokenNames = new String[] {
[javac]                                  ^
[javac] 1 error

Questions:

  1. If the Java code being compiled were really too large (> 64K), shouldn't both compilers have yielded the same error?

  2. If it is the bytecode that is too large, is there any way to get the Oracle compiler to generate space-efficient code similar to Eclipse's?

    For code maintenance reasons, I'd prefer neither to edit the original MySQL grammar I got from the Antlr site, nor to edit the generated Java parser/lexer code to make it small enough for Oracle. So I would prefer some "meta", compiler-level workaround.

  3. Does the above error message mean that the problem is just tokenNames being too big? Or is it the overall class it is a member of that is too big?

MT0
Harry
  • Neon?? Oh dear, I'm still on Helios – Nick is tired May 15 '17 at 07:13
  • Note that the Eclipse compiler is a completely different codebase than `javac`. As long as the generated output is correct this usually doesn't matter. The Eclipse compiler can be used as a stand-alone replacement for javac; you may therefore want to consider explicitly using it to compile this file. – Thorbjørn Ravn Andersen May 15 '17 at 07:42
  • I don't know if this helps with the code size problem, but I strongly recommend that you use a good grammar. The one from the grammar repository is incomplete and misses many language parts. Instead use the 100% complete grammar from [MySQL Workbench](https://github.com/mysql/mysql-workbench/tree/master/library/parsers/grammars). – Mike Lischke May 16 '17 at 07:17
  • Wish I had found that link earlier. – Harry May 20 '17 at 11:17

1 Answer


tokenNames alone is unlikely to produce code that is too large. What happens is that all the static variable initializations in the class (including the DFA initialization, which tends to be very large) are compiled into a single static initializer, which behaves (in some ways) like an ordinary method and is subject to the same code size limit.
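To illustrate the point about a single static initializer, here is a minimal sketch. The class name, field names, and the `record` helper are hypothetical and not part of ANTLR's output. It only shows that static field initializers and static blocks run together, in textual order, as one initialization sequence, which is why their sizes add up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not generated by ANTLR.
class InitOrderDemo {
    // Records the order in which the static initialization steps run.
    // Declared first so it exists before the other initializers use it.
    static final List<String> LOG = new ArrayList<>();

    // A static field initializer: its code becomes part of the class initializer.
    static final String[] FIRST = record("FIRST", new String[] { "a", "b" });

    // A static block: its code is appended to the same class initializer.
    static {
        LOG.add("static block");
    }

    // Another field initializer, again appended to the same initializer.
    static final int[] SECOND = record("SECOND", new int[] { 1, 2, 3 });

    // Logs the step name and passes the value through unchanged.
    static <T> T record(String name, T value) {
        LOG.add(name);
        return value;
    }

    public static void main(String[] args) {
        System.out.println(LOG);
    }
}
```

Because every element of an array literal is stored by generated instructions rather than kept as plain data, large tables like the ones in a generated parser can fill that shared initializer quickly.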

What we actually did when we ran into this problem (with an even larger grammar though) was modify the template that generates the Java parser so that instead of things like

public static final String[] tokenNames = new String[] { ... };

it produces code like

public static String[] tokenNames;
static {
    tokenNames = new String[] { ... };
}

(We actually did this for the DFA initialization, not tokenNames, since that is where we had the problem.)
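A related variant, sketched here as an assumption rather than as the template change described above: since static blocks and static field initializers all end up in the same class initializer, one way to give each table its own size budget is to build it in a separate static method, because the 64 KB limit applies per method. The class name, method names, and the tiny placeholder tables below are hypothetical; real generated tables are far larger.

```java
// Hypothetical stand-in for a generated parser class.
final class SplitInitDemo {
    // Each field stays final; only a short method call remains
    // in the class initializer.
    static final String[] TOKEN_NAMES = buildTokenNames();
    static final short[] DFA_TRANSITIONS = buildDfaTransitions();

    // The array-filling code lives in this method's body,
    // which has its own code size limit.
    private static String[] buildTokenNames() {
        return new String[] { "TOKEN_A", "TOKEN_B", "TOKEN_C" };
    }

    // A second table, built in its own method for the same reason.
    private static short[] buildDfaTransitions() {
        return new short[] { 0, 1, 1, 2, -1 };
    }

    public static void main(String[] args) {
        System.out.println(TOKEN_NAMES.length + " tokens, "
                + DFA_TRANSITIONS.length + " transitions");
    }
}
```

A single table that is itself too large for one method would still need to be split further, for example across several builder methods.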

This involves modifying the ANTLR tool, which is perhaps not the simplest solution for your case, where Eclipse is still able to compile the code. On the plus side, it is a text-only modification that only requires a zip/unzip tool and a text editor, plus learning how ANTLR works inside.

Jiri Tousek