Description
I'm trying to create a custom language in which the lexer rules are separated from the parser rules. I also want to split the lexer and parser rules further into specific files (e.g., common lexer rules and keyword rules).
But I don't seem to be able to get it to work.
Although I'm not getting any errors while generating the parser (.java files), grun fails with Exception in thread "main" java.lang.ClassCastException.
Note
I'm running ANTLR 4.7.2 on Windows 7, targeting Java.
Code
I created a set of files that closely mimic what I intend to achieve. The example below defines a language called MyLang and separates the lexer and parser grammars. I'm also splitting the lexer rules into four files:
// MyLang.g4
parser grammar MyLang;

options { tokenVocab = MyLangL; }

prog
    : ( func )* END
    ;

func
    : DIR ID L_BRKT (stat)* R_BRKT
    ;

stat
    : expr SEMICOLON
    | ID OP_ASSIGN expr SEMICOLON
    | SEMICOLON
    ;

expr
    : expr OPERATOR expr
    | NUMBER
    | ID
    | L_PAREN expr R_PAREN
    ;
// MyLangL.g4
lexer grammar MyLangL;

import SkipWhitespaceL, CommonL, KeywordL;

@header {
package com.invensense.wiggler.lexer;
}

@lexer::members { // place this class member only in lexer
Map<String,Integer> keywords = new HashMap<String,Integer>() {{
    put("for", MyLangL.KW_FOR);
    /* add more keywords here */
}};
}

ID  : [a-zA-Z]+
    {
    if ( keywords.containsKey(getText()) ) {
        setType(keywords.get(getText())); // reset token type
    }
    }
    ;

DIR : 'in' | 'out' ;
END : 'end' ;
// KeywordL.g4
lexer grammar KeywordL;

@lexer::header { // place this header action only in lexer, not the parser
import java.util.*;
}

// explicitly define keyword token types to avoid implicit def warnings
tokens {
    KW_FOR
    /* add more keywords here */
}
// CommonL.g4
lexer grammar CommonL;

NUMBER : FLOAT | INT | UINT ;
FLOAT : NEG? DIGIT+ '.' DIGIT+ EXP? | INT ;
INT : NEG? UINT+ ;
UINT : DIGIT+ EXP? ;

OPERATOR : OP_ASSIGN | OP_ADD | OP_SUB ;
OP_ASSIGN : ':=';
OP_ADD : POS;
OP_SUB : NEG;

L_BRKT : '[' ;
R_BRKT : ']' ;
L_PAREN : '(' ;
R_PAREN : ')' ;
SEMICOLON : ';' ;

fragment EXP : [Ee] SIGN? DIGIT+ ;
fragment SIGN : POS | NEG ;
fragment POS: '+' ;
fragment NEG : '-' ;
fragment DIGIT : [0-9];
// SkipWhitespaceL.g4
lexer grammar SkipWhitespaceL;

WS : [ \t\r\n]+ -> channel(HIDDEN) ;
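For clarity, the ID action in MyLangL.g4 is meant to retype an identifier as a keyword token through a table lookup. Stripped of ANTLR, the idea is roughly the sketch below. The class name, method name, and integer token types are hypothetical stand-ins chosen for illustration; in the real lexer ANTLR assigns the token type values.

```java
import java.util.HashMap;
import java.util.Map;

public class KeywordLookupDemo {
    // Hypothetical token type values; ANTLR generates the real ones.
    static final int ID = 1;
    static final int KW_FOR = 2;

    // Keyword table, analogous to the `keywords` map in @lexer::members.
    static final Map<String, Integer> KEYWORDS = new HashMap<>();
    static {
        KEYWORDS.put("for", KW_FOR);
    }

    // Mirrors the ID rule's action: keyword type if the text is in the table, else ID.
    static int tokenTypeFor(String text) {
        return KEYWORDS.getOrDefault(text, ID);
    }

    public static void main(String[] args) {
        System.out.println(tokenTypeFor("for")); // type of the keyword token
        System.out.println(tokenTypeFor("foo")); // type of a plain identifier
    }
}
```

The lexer action does the same thing with containsKey and setType instead of returning a value.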
Output
Here is the exact output I receive from the code above:
ussjc-dd9vkc2 | C:\M\w\s\a\l\example
§ antlr4.bat .\MyLangL.g4
ussjc-dd9vkc2 | C:\M\w\s\a\l\example
§ antlr4.bat .\MyLang.g4
ussjc-dd9vkc2 | C:\M\w\s\a\l\example
§ javac *.java
ussjc-dd9vkc2 | C:\M\w\s\a\l\example
§ grun MyLang prog -tree
Exception in thread "main" java.lang.ClassCastException: class MyLang
at java.lang.Class.asSubclass(Unknown Source)
at org.antlr.v4.gui.TestRig.process(TestRig.java:135)
at org.antlr.v4.gui.TestRig.main(TestRig.java:119)
ussjc-dd9vkc2 | C:\M\w\s\a\l\example
§
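For reference, the exception at the top of the trace is thrown by Class.asSubclass, which fails when the class it is called on is not a subtype of the requested class. The sketch below reproduces that behavior in isolation. It does not use ANTLR, and Base, Derived, Unrelated, and loadsAsBase are hypothetical names, so it only illustrates the kind of failure and not what TestRig does internally.

```java
public class AsSubclassDemo {
    // Hypothetical stand-ins; none of these are ANTLR classes.
    static class Base {}
    static class Derived extends Base {}
    static class Unrelated {}

    // Returns true if `candidate` can be viewed as a subclass of Base,
    // false if asSubclass rejects it with a ClassCastException.
    static boolean loadsAsBase(Class<?> candidate) {
        try {
            candidate.asSubclass(Base.class);
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(loadsAsBase(Derived.class));
        System.out.println(loadsAsBase(Unrelated.class));
    }
}
```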