I'm trying to build a recursive descent parser that builds an AST as it goes. The language is fairly simple;
x >/</= y #x, y can be any letter/number, matched by regex.
Then, AND/OR
with opening/closing brackets follow and everything repeats itself. The brackets are always where they should be, x/y are always valid.
My code:
private Node parseLogic(Node lhs) {
String curr = token.curr();
if (curr.equals("(")) {
leftCount++;
token.progress(1);
return parseExpression();
}
if (curr.equals("AND ")) { // we want to get the rhs expression so we have to reparse
token.progress(1);
Node rhs = parseExpression();
return new And(lhs, rhs);
}
if (curr.equals("OR ")) { //same
token.progress(1);
Node rhs = parseExpression();
return new Or(lhs, rhs);
}
if (curr.equals(")")){
rightCount++;
token.progress(1);
if (leftCount == rightCount){
return lhs;
}
return parseLogic(lhs);
}
else{
return null;
}
}
private Node parseExpression(){
String curr = token.curr();
if (curr.equals("(")){
leftCount++;
token.progress(1);
return parseExpression();
}
else if (exPatt.matcher(curr).matches()){ // if it matches an expression, go inside. if next is ), close it
Node node = new Expression(curr);
if (token.next().equals("AND ") || token.next().equals("OR ") ||
token.next().equals("(") || token.next().equals(")")){
token.progress(1);
return parseLogic(node);
}
return node;
}
return null;
}
public static void main(String[] args) {
Parser parse = new Parser("amount > 50 AND ((row < 20 OR team > 30) AND (test < 40 OR you < 20)))");
//Node node = parse.mainParser();
Node node = parse.parseExpression();
node.getExpression();
}
}
The code works great on examples such as
amount > 50 OR (amount < 30 AND (test < 20 OR (test > 10 OR left < 10)))
But fails in ones like
amount > 50 AND ((row < 20 OR team > 30) AND (test < 40 OR you < 20)))
Where the handling of the closed parentheses fails, and instead of having the second node be
LHS = OR (row < 20, team > 30)
RHS = OR (test < 40, you <20)
we get
LHS = row < 20
RHS = AND (team > 30 OR (test <40, you <20))
Any pointers would be greatly appreciated.
EDIT: I think I fixed it. Keeping this up incase anyone else faces a recursive descent parser with an AST, as there aren't many SO questions on this topic.
Inside parseLogic
, the and/ors instead of returning just the node, re-enter parseLogic
with the new nodes. This is to basically check if the node is the left of any other logic node. if not, in the end, lhs
is simply returned.