
I'm having trouble resolving something that I feel should be trivial. I modified PLY's calc example to add function calls with an argument list that accepts one or more expressions (an expression is a name, number, or literal). The problem is that my implementation behaves nondeterministically: sometimes a multi-argument call parses correctly and other times it doesn't.

For example, I enter this at the prompt while the program is running:

currentDoc(2,5)

On some runs it produces this:

calc > currentDoc(2,5)
p_expression_number
p_expression_number
p_expression_function
its in there: [2, 5]

On other runs it produces this (which I want to avoid):

calc > currentDoc(2,5)
Syntax error at ','
p_expression_number

Both times the parser reaches the right rule (p_expression_number), but on some runs it goes wrong afterwards. About half the time it seems to reduce to the expression, skip past the comma, and then complain that it doesn't understand the next token, which is the comma. Other times it's fine.

How do I resolve this? I've tried several things and I've looked up examples but I can't figure it out.

Here's my code:

tokens = ( 'NAME','NUMBER', 'EQUALS', 'COMMA', 'DOT', 'LITERAL', 'LPAREN','RPAREN', )

t_EQUALS  = r'='
t_LPAREN  = r'\('
t_RPAREN  = r'\)'
t_COMMA   = r','
t_DOT     = r'.'
t_LITERAL = r'(\'[^\']*\'|"[^"]*")'
t_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'

functions = {
    'currentDoc',
}

def t_NUMBER(t):
    r'\d+'
    try:
        t.value = int(t.value)
    except ValueError:
        print("Integer value too large %s" % t.value)
        t.value = 0
    return t

t_ignore = " \t"

def t_newline(t):
    r'\n+'
    t.lexer.lineno += t.value.count("\n")

def t_error(t):
    print("Illegal character '%s'" % t.value[0])
    t.lexer.skip(1)

import ply.lex as lex
lex.lex()

names = {}

def p_statement_assign(t):
    'statement : NAME EQUALS expression'
    print("p_statement_assign")
    names[t[1]] = t[3]

def p_statement_expr(t):
    'statement : expression'
    print("p_statement_expr")
    print(t[1])

def p_parameters(t):
    '''
    parameters : expression
        | parameters COMMA expression
    '''
    if len(t) == 2:
        t[0] = [t[1]]
    else:
        t[0] = t[1]
        t[0].append(t[3])

def p_expression_function(t):
    '''
    statement : NAME LPAREN parameters RPAREN
    statement : NAME DOT NAME LPAREN parameters RPAREN
    '''
    print("p_expression_function")
    if t[2] == ".":
        print('tis dot')
    if t[1] in functions:
        print("its in there:", t[3])
    else:
        print("Function '%s' not defined" % t[1])

def p_expression_literal(t):
    'expression : LITERAL'
    print("p_literal")
    t[0] = t[1][1:-1]

def p_expression_number(t):
    'expression : NUMBER'
    print("p_expression_number")
    t[0] = t[1]

def p_expression_name(t):
    'expression : NAME'
    print("p_expression_name")
    try:
        t[0] = names[t[1]]
    except LookupError:
        print("Undefined name '%s'" % t[1])
        t[0] = 0

def p_error(t):
    if t:
        print("Syntax error at '%s'" % t.value)

import ply.yacc as yacc
yacc.yacc()

while 1:
    try: s = input('calc > ')
    except EOFError:
        break
    yacc.parse(s)
risto
  • I couldn't reproduce the error. I did notice, however, that t_DOT should be '\.', or else it would match everything. – swstephe Feb 10 '14 at 18:24
  • Thanks, that worked! it turns out that the t_DOT was matching everything like you said. – risto Feb 24 '14 at 04:02
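
To illustrate why the unescaped t_DOT can make the result depend on the run, here is a minimal sketch using only Python's re module rather than PLY itself. PLY's documentation says string-defined tokens are tried in order of decreasing regex length; r'.' and r',' have the same length, so the sketch assumes (this is an assumption, not something confirmed in the thread) that their relative order could differ between runs. The helper first_token is hypothetical and only mimics "first matching rule wins".

```python
import re

def first_token(rules, text):
    """Return the name of the first rule whose pattern matches at the start of text.

    Hypothetical helper: it only imitates 'first matching rule wins';
    it is not PLY's actual implementation.
    """
    for name, pattern in rules:
        if re.match(pattern, text):
            return name
    return None

BUGGY_DOT = r'.'    # unescaped: matches any character except a newline
FIXED_DOT = r'\.'   # escaped: matches only a literal dot
COMMA = r','

# With the buggy pattern, the token assigned to ',' depends on rule order.
dot_first = first_token([('DOT', BUGGY_DOT), ('COMMA', COMMA)], ',')
comma_first = first_token([('COMMA', COMMA), ('DOT', BUGGY_DOT)], ',')

# With the escaped pattern, rule order no longer matters.
fixed_dot_first = first_token([('DOT', FIXED_DOT), ('COMMA', COMMA)], ',')
fixed_comma_first = first_token([('COMMA', COMMA), ('DOT', FIXED_DOT)], ',')

print(dot_first, comma_first)              # DOT COMMA
print(fixed_dot_first, fixed_comma_first)  # COMMA COMMA
```

In the question's lexer, the corresponding change is the one suggested in the comment: t_DOT = r'\.'.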
