Here is a basic structure for simple nested expressions...
infix : prefix (INFIX_OP^ prefix)*;
prefix : postfix | (PREFIX_OP postfix) -> ^(PREFIX_OP postfix);
postfix : INT (POSTFIX_OP^)?;
POSTFIX_OP : '!';
INFIX_OP : '+';
PREFIX_OP : '-';
INT : '0'..'9'+;
If I wanted to create a list of these expressions I could use the following...
list: infix (',' infix)*;
Here we use the ',' as a delimiter.
I want to be able to build a list of expressions without any delimiter.
So if I have the string 4 5 2+3 1 6
I would like to be able to interpret that as (4) (5) ^(+ 2 3) (1) (6)
The problem is that both 4 and 2+3 have the same first symbol (INT), so I have a conflict. I'm trying to figure out how I can resolve this.
EDIT
I've almost figured it out, just having trouble coming up with the correct rewrite for a certain condition...
expr: (a=atom -> $a)
(op='+' b=atom-> {$a.text != "+" && $b.text != "+"}? ^($op $expr $b) // infix
-> {$b.text != "+"}? // HAVING TROUBLE COMING UP WITH THIS CORRECT REWRITE!
-> $expr $op $b)*; // simple list
atom: INT | '+';
INT : '0'..'9'+;
This will parse 1+2+3++4+5+ as ^(+ ^(+ 1 2) 3) (+) (+) ^(+ 4 5) (+), which is what I want.
Now I'm trying to finish my rewrite rule so that ++1+2 will parse as (+) (+) ^(+ 1 2).
Overall, I want to take a list of tokens, find all the infix expressions in it, and leave the remaining tokens as a plain list.
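To make the target behaviour concrete, here is a minimal Python sketch of the grouping I am after. It is not ANTLR and does not solve the rewrite problem. The names tokenize, group_infix and render are hypothetical helpers made up for illustration, and the greedy rule it uses (an INT followed by '+' INT pairs folds into a left-associative tree, and every other token stays a standalone list item) is my assumption about the intended semantics.

```python
import re


def tokenize(text):
    """Split the input into INT and '+' tokens, skipping whitespace."""
    return re.findall(r"\d+|\+", text)


def group_infix(tokens):
    """Scan left to right and return a list of items.

    An INT followed by one or more ('+' INT) pairs becomes a nested tuple
    ('+', left, right), built left-associatively. Any other token,
    including a '+' that is not between two INTs, is kept as a plain string.
    """
    items = []
    i = 0
    while i < len(tokens):
        node = tokens[i]
        i += 1
        if node != "+":
            # Fold while the next two tokens are '+' followed by an INT.
            while (i + 1 < len(tokens)
                   and tokens[i] == "+"
                   and tokens[i + 1] != "+"):
                node = ("+", node, tokens[i + 1])
                i += 2
        items.append(node)
    return items


def render(node, top=True):
    """Format an item in the ^(op left right) notation used above."""
    if isinstance(node, tuple):
        op, left, right = node
        return "^({} {} {})".format(op, render(left, False), render(right, False))
    # Standalone list items are shown in parentheses, operands are bare.
    return "({})".format(node) if top else node


def parse(text):
    return " ".join(render(item) for item in group_infix(tokenize(text)))
```

Calling parse on the three inputs from this question (4 5 2+3 1 6, then 1+2+3++4+5+, then ++1+2) should reproduce the trees written out above. What I am still missing is how to express the same decision inside the ANTLR rewrite.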