This is a follow-up to an earlier question, which Bart answered perfectly.
My goal is to get individual lines, both for "generic script lines" and for "lines inside a function body", ideally discarding the surrounding whitespace, while still getting everything outside the <% and %> tags in bulk. I came up with a solution, but the resulting tree looks messy.
Here is my lexer:
lexer grammar CmScriptLexer;
//Whitespace: Spaces -> channel(HIDDEN);
ScriptStart : '<%' (Spaces)* -> mode(Script);
SpacesPlain : [\r\n]+ -> skip;
GenericText : . ;
mode Script;
ScriptEnd : '%>' -> mode(DEFAULT_MODE);
Comment : '\'' ~[\r\n]* -> skip;
Function : 'function' -> mode(FunctionDeclaration);
NL : [\r\n]+;
ScriptText : . ;
mode FunctionDeclaration;
FunctionComment : '\'' ~[\r\n]* -> skip;
FunctionName : Id;
DeclarationSpaces : Spaces+ -> skip;
OPar : '(' -> mode(FunctionParameter);
mode FunctionParameter;
FunctionParameterComment : '\'' ~[\r\n]* -> skip;
ParameterName : Id;
ParameterSpaces : Spaces+ -> skip;
Comma : ',';
CPar : ')' -> mode(InFunction);
mode InFunction;
FunctionBodyComment : '\'' ~[\r\n]* -> skip;
EndFunction : 'end' Spaces 'function' -> mode(Script);
FunctionLine : ~[ \t\r\n]+; // also exclude tabs, to match the Spaces fragment
FunctionSpaces : Spaces+;
//FunctionText : . ;
fragment Spaces : [ \r\n\t]+;
fragment Id : [a-zA-Z0-9_\u0080-\ufffe]+;
and my parser:
parser grammar CmScriptParser;
options { tokenVocab=CmScriptLexer; }
file
    : block* EOF
    ;
block
    : plainText
    | ScriptStart script* ScriptEnd
    ;
plainText
    : GenericText+ NL*
    ;
script
    : simpleScript NL*
    | function NL*
    ;
simpleScript
    : ScriptText+
    ;
function
    : Function FunctionName OPar parameters? CPar functionBody EndFunction
    ;
functionBody
    : functionLines+
    ;
functionLines
    : FunctionSpaces* functionLine FunctionSpaces*
    ;
functionLine
    : FunctionLine+
    ;
parameters
    : ParameterName ( Comma ParameterName )*
    ;
and finally what I'm using as a test case:
foo
bar
<%
line 1
line 2
function x(y)
spanning
multiple
lines
end function
function a(b) no newlines end function
%>
baz
My issue is that it all seems really verbose, and I fear that my "solution", while it works with the test case, is poorly laid out and that I may be overthinking the rules.
Any suggestions on how to improve this? All I want is trimmed "line" elements: ideally, matching something like \n \n\n\tscript line \n\n\t\n would result in a line containing just script line.
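To pin down what "trimmed" means here, the following is a minimal Java sketch of the desired result, done on a plain string rather than in the grammar. The class and method names (LineTrimmer, trimmedLines) are hypothetical and only illustrate the expected output; they are not part of the lexer or parser above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the grammar: it only illustrates
// the trimming behaviour the question is asking for.
class LineTrimmer {

    // Splits raw text on line breaks, strips surrounding whitespace from
    // each piece, and drops pieces that are empty after stripping.
    static List<String> trimmedLines(String raw) {
        List<String> lines = new ArrayList<>();
        for (String piece : raw.split("\\R")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // prints [script line]
        System.out.println(trimmedLines("\n \n\n\tscript line \n\n\t\n"));
        // prints [spanning, multiple, lines]
        System.out.println(trimmedLines("spanning\n  multiple\n\tlines\n"));
    }
}
```

The same post-processing could be applied to a token's or rule's text while walking the tree, which is one way to keep the grammar itself simpler.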
EDIT: Adding an example of what I think I am after; again, I may not be expressing it in the best way possible:
simpleScript:
    scriptLine: line 1
    scriptLine: line 2
function:
    name: x
    parameters:
        parameter: y
    body:
        functionLine: spanning
        functionLine: multiple
        functionLine: lines
function:
    name: a
    parameters:
        parameter: b
    body:
        functionLine: no newlines
The end goal is that, when walking the tree, I can create a new "function call object" and make calls like:
script = new Script() // on script "enter"
script.addLine("line 1")
script.addLine("line 2")
program.addNode(script) // on script "exit"
...
function = new Function() // on function "enter"
function.setName("x") // on "function"?
...
function.addParameter("y") // on "parameter"
...
function.addBodyLine("spanning") // on "line" ??
function.addBodyLine("multiple")
function.addBodyLine("lines")
...
program.addFunctionDeclaration(function) // on function "exit" once complete
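
As a rough sketch of that target, the Java below models the objects from the pseudocode and replays, by hand, the calls a tree listener would presumably make for the test case. Everything here is an assumption for illustration: the names ProgramModelDemo, Program, Script and FunctionDecl are hypothetical, and the listener callbacks are simulated with direct calls so the example runs without ANTLR on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical object model mirroring the pseudocode; not generated by ANTLR.
class ProgramModelDemo {

    static class Script {
        final List<String> lines = new ArrayList<>();
        void addLine(String line) { lines.add(line); }
    }

    static class FunctionDecl {
        String name;
        final List<String> parameters = new ArrayList<>();
        final List<String> bodyLines = new ArrayList<>();
        void setName(String name) { this.name = name; }
        void addParameter(String parameter) { parameters.add(parameter); }
        void addBodyLine(String line) { bodyLines.add(line); }
    }

    static class Program {
        final List<Script> scripts = new ArrayList<>();
        final List<FunctionDecl> functions = new ArrayList<>();
        void addNode(Script script) { scripts.add(script); }
        void addFunctionDeclaration(FunctionDecl function) { functions.add(function); }
    }

    // Replays the enter/exit sequence for the test case with direct calls.
    static Program build() {
        Program program = new Program();

        Script script = new Script();             // on script "enter"
        script.addLine("line 1");
        script.addLine("line 2");
        program.addNode(script);                  // on script "exit"

        FunctionDecl x = new FunctionDecl();      // on function "enter"
        x.setName("x");
        x.addParameter("y");
        x.addBodyLine("spanning");
        x.addBodyLine("multiple");
        x.addBodyLine("lines");
        program.addFunctionDeclaration(x);        // on function "exit"

        FunctionDecl a = new FunctionDecl();
        a.setName("a");
        a.addParameter("b");
        a.addBodyLine("no newlines");
        program.addFunctionDeclaration(a);

        return program;
    }

    public static void main(String[] args) {
        Program program = build();
        for (FunctionDecl f : program.functions) {
            System.out.println(f.name + f.parameters + " -> " + f.bodyLines);
        }
    }
}
```

With a real parse tree, the body of build() would be spread across the enter and exit methods of a listener for the function, parameters and functionLine rules.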