ANTLR: Resolving code too large in a root grammar's static initializer

Question

While searching for a solution to my problem, I found this question, which suggests composite grammars to get rid of the code too large error. The trouble is that I'm already using grammar imports, but when I further extend one of the imported grammars, the root parser grammar shows the error. Apparently, the problem lies in the many tokens and DFA definitions that ANTLR generates after analyzing the whole grammar. Is there a way, or a suggested way, to get rid of this problem? Is it scalable, i.e. does it still work when the parts changed by the workaround are themselves large?

EDIT: To make this clear (the linked question didn't): the code too large error is a compiler error on the generated parser code. To my understanding, it is usually caused by a grammar so large that some generated method exceeds the limit of the Java specification (64 KB of bytecode per method). In my case, it's the static initializer of the root parser class, which contains tons of DFA lookahead variables, all of which result in code in the initializer. So, ideally, ANTLR should be able to split that up in the case that the grammar is too big/the user tells ANTLR to do it. Is there such an option?

(I have to admit, the asker of the linked question had an... interesting rule that caused his grammar to bloat up, and it may be my error here, too. But the possibility that this is not the grammar author's error (in any large grammar) stands, so I see this as a valid, non-grammar-specific ANTLR question.)

EDIT END
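
To illustrate the mechanism described in the edit above, here is a minimal hand-written Java sketch. It is not actual ANTLR output, and every class, field, and method name in it is hypothetical. It only shows the general Java behavior: all static field initializers of one class end up in that class's single static initializer, while nested classes each get their own. Whether the generated parser can be restructured this way (for example by post-processing the generated source) is an assumption, not something ANTLR or the Maven plugin is known to offer.

```java
// Hand-written illustration; NOT actual ANTLR output. All names are hypothetical.
final class SplitInitializerSketch {

    // Every static field initializer of a class is compiled into that class's
    // single <clinit> method, and each method is limited to 64 KB of bytecode.
    // Nested classes get their own <clinit>, so grouping tables into holder
    // classes spreads the initialization code over several methods.
    static final class Dfa1Tables {
        static final short[] EOT = unpack("\1\2\3");
        static final short[] EOF = unpack("\4\5");
    }

    static final class Dfa2Tables {
        static final short[] EOT = unpack("\6\7");
    }

    // Stand-in for a table decoder: one short per character of the string.
    static short[] unpack(String encoded) {
        short[] out = new short[encoded.length()];
        for (int i = 0; i < encoded.length(); i++) {
            out[i] = (short) encoded.charAt(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // Each holder class is initialized lazily on first access.
        System.out.println(Dfa1Tables.EOT.length + Dfa2Tables.EOT.length);
    }
}
```

With this layout, the outer class's static initializer stays small no matter how many holder classes exist, so the approach does not depend on any single part being small, as long as each holder's own tables fit within the limit.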

My grammar parses "Magic the Gathering" rules text and is available here (git). The problem specifically appears when exchanging line 33 for 34-36 in this file. I use Maven and antlr3-maven-plugin for building, so ideally, the workaround is doable using the plugin, but if it's not, that's a smaller problem than the one I have now...

Thanks a lot, and I hope I haven't overlooked any obvious documentation that would help me.

Bart Kiers · Accepted Answer · 2011-11-06T06:36:51.617


The fragment keyword can only be used before lexer rules, not before parser rules as I see you do. First change that in all your grammars (I only looked at ObjectExpressions.g). It's unfortunate that ANTLR does not produce an error when you try it. But believe me: it's wrong, and might be causing (a part of) your problem(s).

Also, your rule from lines 34-36:

qualities
  :  qualities0 
  |  qualities0 (COMMA qualities0)+ -> qualities0+ 
  |  qualities0 (Or qualities0)+    -> ^(Or qualities0+)
  ;

should be rewritten as:

qualities
  :  qualities0 (COMMA qualities0)* -> qualities0+ 
  |  qualities0 (Or qualities0)+    -> ^(Or qualities0+)
  ;

EDIT

So, ideally, ANTLR should be able to split that up in the case that the grammar is too big/the user tells ANTLR to do it. Is there such an option?

No, there is no such option unfortunately. You'll have to divide the grammar into (even more) smaller ones.

edited Nov 06 '11 at 06:36

answered Nov 03 '11 at 20:26

Bart Kiers


hi! thanks for your tip. I actually did that refactoring, but haven't committed it yet. I also removed fragment as you suggested, but it didn't help with my problem. now that this cleanup is done, can you imagine a solution for the persistent problem? – Silly Freak Nov 04 '11 at 20:25
@SillyFreak, no, I can't suggest anything else with the info posted in your original question. And I'm sorry to say, but I'm not going to download/fork a repository and figure out what steps to take to try and generate parsers from certain grammars and see where things go wrong. I am willing to try and help you with things you can post here on SO though. So if you can post the offending grammar (or a part of it, if it's large) and explain which rule(s) are producing the error(s), perhaps I, or someone else, can help. That way, I can more easily reproduce the error(s). – Bart Kiers Nov 05 '11 at 08:12
I understand that you won't usually download a grammar (or any piece of source code) to answer a question, and I didn't mean for you to do so. I guess I should have been more specific with my error; the question I linked to didn't state it more precisely. I'll edit my question with the info. In short, I think (I'm not sure, of course) that it's a problem with your tool, not with my grammar, and I'm asking for a known workaround for that problem, not to fix my grammar – Silly Freak Nov 05 '11 at 22:25
