I am trying to build a CAS in Python and am currently stuck on the parser that turns an expression string into an anytree tree I can eventually manipulate and simplify. The issue seems to be that, when parsing, yacc doesn't build my GROUP node with its proper child nodes as defined in my parser grammar spec. I've tried changing precedences and associativities and reordering the grammar rules, but nothing makes it parent the nodes properly. What's even stranger is that in debug/verbose mode it creates a node for the expression when it matches it, but for some reason fails to parent it to the GROUP node when it recognises an LPAREN expression RPAREN sequence.
Here's my code:
import ply.yacc as yacc
from anytree import Node, RenderTree
import ply.lex as lex

#init token names
tokens = (
    'INTEGER',
    'PLUS',
    'MINUS',
    'TIMES',
    'DIVIDE',
    'LPAREN',
    'RPAREN',
)

#init regex rules for said tokens
t_PLUS = r'\+'
t_MINUS = r'-'
t_TIMES = r'\*'
t_DIVIDE = r'/'
t_LPAREN = r'\('
t_RPAREN = r'\)'

#init regex rule/function for integers
def t_INTEGER(t):
    r'\d+'
    t.value = int(t.value)
    return t

#ignoring whitespace
t_ignore = '\t'

#handling unknown characters (%s inserts whatever is past the second %)
def t_error(t):
    print("Illegal character '%s'" %t.value[0] )
    t.lexer.skip(1)

#building the lexer
lexer = lex.lex()

#parser grammar spec
#binary operators
def p_expression_binop(p):
    '''expression : expression PLUS expression
                  | expression MINUS expression
                  | expression TIMES expression
                  | expression DIVIDE expression '''
    p[0] = Node(p[2],children = [Node(p[1]),Node(p[3])])

#brackets/grouping
def p_expression_group(p):
    'expression : LPAREN expression RPAREN'
    p[0] = Node(p[1], children = [Node(p[2])])

#integers
def p_expression_number(p):
    'expression : INTEGER'
    p[0] = Node(p[1])

def p_error(p):
    print("Input syntax error ")

treeParser = yacc.yacc()

while True:
    try:
        s = input("Calculate this >")
    except EOFError:
        break
    if not s: break
    ParsTree = treeParser.parse(s)
    print(RenderTree(ParsTree))
Sample input: (2)+(2)
Sample Output:
Calculate this >(2)+(2)
Node('/+')
├── Node("/+/Node('/GROUP')")
└── Node("/+/Node('/GROUP')")
As you can see, it only creates GROUP nodes and does not make any child integer nodes under said GROUP nodes.
Edit: Made the code self-contained and added sample input and output to better explain the problem.