3

I'm trying to create my own analyser/parser.

I have a problem which I understand why it doesn't work but I'm unsure of how to solve it.

This is the code for the problem part of my parser.

void Expression() : {}{
    Term() ((<PLUS> | <MINUS>) Term())*
}

void Term() : {}{
     Factor()((<MULTIPLY> | <DIVIDE>) Factor())*
}

void Factor() : {}{
    (<ID> | <NUMBER> | ((<PLUS> | <MINUS>)?<OPEN_PARENTHESIS> Expression() <CLOSE_PARENTHESIS>))
}

void Condition() : {}{
    (
        (<NOT> Condition()) |
        (<OPEN_PARENTHESIS> Condition() (<AND> | <OR>) Condition() <CLOSE_PARENTHESIS>) |
        (Expression() (<EQUAL_CHECK> | <NOT_EQUAL> | <LESS> | <LESS_EQUAL> | <GREATER> | <GREATER_EQUAL>) Expression())     
    )
}

As you can see, the problem comes within the Condition() method from the last two of the three options in the OR section. This is because Expression() can eventually become "( Expression() )" therefore both the third and second option can begin with a open parenthesis token.

However, I'm unsure how I would solve this problem. I solved a similar problem earlier in the parser however I can't employ the same logic here without it being extremely messy because of the way Expression() --> Term() --> Factor() and the problem code being all the way down in the Factor() method.

Any advice would be greatly appreciated.

Thanks,

Thomas.

EDIT:

For more info, I'll provide to code examples that should work with this parser but will not due to the bug explained above.

fun succesful_method()
    start
        var i = 1;
        if(i > 0 and i < 2)
        do
            i = 2;
        stop
    stop

start
    successful_method()
stop

The above method would run successfully as it uses the second alternative of the Condition() method.

fun succesful_method()
    start
        var i = 1;
        if(i > 0)
        do
            i = 2;
        stop
    stop

start
    successful_method()
stop

The above method would fail, as it requires use of the third alternative, however it cannot access this due to the '(' causing the parser to call the second alternative.

jww
  • 97,681
  • 90
  • 411
  • 885
Feint
  • 67
  • 7
  • Many languages (e.g. Java, C, C++, Python, Haskell, ML) take the approach of not syntactically distinguishing between conditions and expressions. One reason for this is that Boolean variables are really useful. Once you have Boolean variables you'll want to have if statements like this `if q do ... stop` and assignment statements like this `q = a < b ;`. At this point keeping Boolean expressions and other expressions syntactically separate gets to be futile. So you should give some consideration to @1010's answer. – Theodore Norvell May 01 '15 at 00:24

2 Answers2

1

Using a single grammar for all expressions and defining precedence for all operators should solve your problem, at the expense of adding semantic checks for the type of expressions.

Expr -> AndExpr (<OR> AndExpr)*
AndExpr -> NotExpr (<AND> NotExpr)*
NotExpr -> <NOT>* RelExpr
RelExpr -> NumExpr () (<RELOP> NumExpr)?

NumExpr -> Term ((<PLUS>|<MINUS>) Term)*
Term -> Factor ((<MULTIPLY>|<DIVIDE>) Factor)*
Factor -> (<PLUS>|<MINUS>)* Atom
Atom -> <ID> | <NUMBER> | <OPEN_PARENTHESIS> Expr <CLOSE_PARENTHESIS>

The token <RELOP>represents your relational operators.

Note that this grammar let's you mix boolean and numerical expressions, so you should check for errors.

For example for Expr -> AndExpr the type returned would be the type of AndExpr. But for AndExpr <OR> AndExpr you should check that both AndExprs are boolean expressions and the type returned by Expr would be Boolean.

1010
  • 1,779
  • 17
  • 27
  • Interesting solution! Thanks a lot. Is there an advantage to this solution over the "Look Ahead" solution provided above in terms of efficiency? – Feint Apr 30 '15 at 23:35
  • as @TheodoreNorvell pointed out, this way it would be easier to add boolean variabes to your language. I don't think you'll notice a difference in performance parsing normal programs. – 1010 May 01 '15 at 01:41
  • You can make this approach highly efficient by using precedence climbing. http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm – Theodore Norvell May 01 '15 at 10:36
1

You can solve this with syntactic look ahead.

void CompOp() : {} { <EQUAL_CHECK> | <NOT_EQUAL> | <LESS> | <LESS_EQUAL> | <GREATER> | <GREATER_EQUAL> }

void Condition() : {}{
        <NOT> Condition()
    |
        LOOKAHEAD(Expression() CompOp()) 
        Expression()
        CompOp()
        Expression()   
    |
        <OPEN_PARENTHESIS>
        Condition()
        (<AND> | <OR>)
        Condition()
        <CLOSE_PARENTHESIS>
}

Slightly more efficient is to only lookahead when there is a (.

void Condition() : {}{
        <NOT> Condition()
    |   LOOKAHEAD( <OPEN_PARENTHESIS> )
        (
            LOOKAHEAD(Expression() CompOp()) 
            Expression()
            CompOp()
            Expression() 
        |
            <OPEN_PARENTHESIS>
            Condition()
            (<AND> | <OR>)
            Condition()
            <CLOSE_PARENTHESIS>
        )
    |
        Expression()
        CompOp()
        Expression()   
}
Theodore Norvell
  • 15,366
  • 6
  • 31
  • 45