I am new to ANTLRWorks. I am writing a combined grammar to parse an XML file that is fairly large and complex.
The grammar defines many lexer rules. ANTLRWorks 1.4.3 generates the code without any problem, but when I debug the grammar with it, compiling the generated lexer fails with the following error:
[13:29:42] D:\Antlr\Grammer Files\output\OrigionalSampleCDFXMLLexer.java:6472: code too large
[13:29:42] public int specialStateTransition(int s, IntStream _input) throws NoViableAltException {
[13:29:42] ^
[13:29:42] 1 error.
Below are the lexer rules defined in my combined grammar file:
DATEFORMATE : DIGIT DIGIT DIGIT DIGIT '-' DIGIT DIGIT '-' DIGIT DIGIT;
TIMEFORMATE : 'T' ( DIGIT DIGIT ':' DIGIT DIGIT ':' DIGIT DIGIT );
CATEGORY_SW_CS_COLLECTION : 'FEATURE' | 'COLLECTION'; // These are fixed
CATEGORY_SW_INSTANCE : 'VALUE' | 'DEPENDENT_VALUE' | 'BOOLEAN' | 'ASCII' | 'VAL_BLK' | 'CURVE'
    | 'MAP' | 'STRUCTURE' | 'UNION'
    | 'VALUE_ARRAY' | 'CURVE_ARRAY' | 'MAP_ARRAY' | 'STRUCTURE_ARRAY';
CATEGORY_SW_AXIS_CONT : 'FIX_AXIS' | 'STD_AXIS' ;
CATEGORY_COMMON_IN_AXIS_INSTANCE : 'CURVE_AXIS' | 'RES_AXIS' | 'COM_AXIS' ;
CATEGORY_SW_INSTANCE_TREE : 'VCD' | 'NO_VCD' ;
CATEGORY_MSRSW : 'CDF20' ;
FLAG_VALUES : 'TRUE' | 'FALSE';
ATTR_EQ : {tagMode}? => '=' ;
PCDATA : {!tagMode}? => (~'<')* ;
//NMTOKENS: {tagMode}? => ( '\"' (NMTOKEN ' ')* '\"' | '\''(NMTOKEN ' ')* '\'') ;
NMTOKEN : {tagMode}? => ( '\"' NMTOKEN_CHAR* '\"' | '\''NMTOKEN_CHAR* '\'');
ID : {tagMode}? => ( '\"' LETTER (LETTER | DIGIT | '_' )* '\"'
| '\'' LETTER (LETTER | DIGIT | '_' )* '\''
)
;
CDATA :
{tagMode}? => ( '\"' (~('\"\'&<>'))* '\"'
| '\'' (~('\"\'&<>'))* '\''
)
;
TAG_START_OPEN : '<' {tagMode = true;};
TAG_END_OPEN : '</' {tagMode = true;};
TAG_CLOSE : {tagMode}? => '>' {tagMode = false;};
TAG_EMPTY_CLOSE : {tagMode}? => '/>' {tagMode = false;};
fragment NMTOKEN_CHAR: (LETTER | DIGIT | '_' | '-' | '.' | ':');
fragment LETTER : 'A'..'Z' | 'a'..'z' | 'ü';
//fragment Exponent : ('e'|'E') ('+'|'-')? (DIGIT)+ ;
fragment DIGIT : '0'..'9';
WS : {tagMode}? => (' ' | '\t'| '\r' | '\n')+ {$channel=99;} ;
And of course I have parser rules in the same file ;-).
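To make clear what I intend the tagMode flag in the rules above to do, here is a small hand-written Java sketch of the same idea. It is not the code ANTLR generates, and it simplifies the grammar: the class name TagModeSketch and the placeholder token IN_TAG (standing in for NMTOKEN, ID, CDATA and the keyword rules) are made up for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written illustration only; this is NOT the code ANTLR generates. */
public class TagModeSketch {

    /** True when a word inside a tag must stop at position i. */
    private static boolean endsWord(String s, int i) {
        char c = s.charAt(i);
        return Character.isWhitespace(c) || c == '=' || c == '>' || s.startsWith("/>", i);
    }

    /** Splits the input into coarse token names, switching behaviour on tagMode. */
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        boolean tagMode = false;              // same role as the grammar's tagMode member
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (!tagMode) {
                if (input.startsWith("</", i)) {          // TAG_END_OPEN enters tag mode
                    tokens.add("TAG_END_OPEN");
                    i += 2;
                    tagMode = true;
                } else if (c == '<') {                    // TAG_START_OPEN enters tag mode
                    tokens.add("TAG_START_OPEN");
                    i++;
                    tagMode = true;
                } else {                                  // PCDATA: everything up to the next '<'
                    int start = i;
                    while (i < input.length() && input.charAt(i) != '<') i++;
                    tokens.add("PCDATA(" + input.substring(start, i) + ")");
                }
            } else if (c == '>') {                        // TAG_CLOSE leaves tag mode
                tokens.add("TAG_CLOSE");
                i++;
                tagMode = false;
            } else if (input.startsWith("/>", i)) {       // TAG_EMPTY_CLOSE leaves tag mode
                tokens.add("TAG_EMPTY_CLOSE");
                i += 2;
                tagMode = false;
            } else if (c == '=') {                        // ATTR_EQ
                tokens.add("ATTR_EQ");
                i++;
            } else if (Character.isWhitespace(c)) {       // WS: skipped (hidden channel in the grammar)
                i++;
            } else {                                      // IN_TAG: hypothetical catch-all placeholder
                int start = i;
                while (i < input.length() && !endsWord(input, i)) i++;
                tokens.add("IN_TAG(" + input.substring(start, i) + ")");
            }
        }
        return tokens;
    }
}
```

For example, TagModeSketch.tokenize("<a x='1'>hi</a>") treats "hi" as PCDATA only because the preceding '>' switched tagMode off; the same characters inside a tag would be read as an in-tag word.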
Replacing most of the '+' operators with '*' in the lexer rules did not help.
Is something wrong with the lexer rules?
Another question:
I tried moving some of the lexer rules from the combined grammar file into a separate lexer grammar file. In that case, importing the lexer grammar into the combined grammar causes a problem. It says 'Lexer file name' is undefined, with the suggested fix 'create the grammar file'.
grammar SampleCDFXML;
options {
language = Java;
output=AST;
tokenVocab=XMLBaseLexer;
}
import XMLBaseLexer ; // Here it says undefined import "XMLBaseLexer"
'XMLBaseLexer' is a lexer grammar that contains some of the lexer rules from the original combined grammar.
I searched many websites for import problems but did not find an answer.
Could someone please suggest how to solve these two problems?
Any help is very much appreciated.
Thank you!