Traverse Context Free Grammar

Question

I am having trouble traversing a CFG in a Prolog environment so that it is processed in postorder. This is the grammar I am using:

list_ast(Ls, AST) :- phrase(expression(AST), Ls).

expression(E)       --> term(T), expression_r(T, E).

expression_r(E0, E) --> [+], term(T), expression_r(E0+T, E).
expression_r(E0, E) --> [-], term(T), expression_r(E0-T, E).
expression_r(E, E)  --> [].

term(T)       --> power(P), term_r(P, T).
term_r(T0, T) --> [*], power(P), term_r(T0*P, T).
term_r(T0, T) --> [/], power(P), term_r(T0/P, T).
term_r(T, T)  --> [].

power(P)          --> factor(F), power_r(F, P).
power_r(P0, P0^P) --> [^], factor(P1), power_r(P1, P).
power_r(P, P)     --> [].

factor(N) --> [N], { number(N) }.
factor(E) --> ['('], expression(E), [')'].

When this grammar is executed, the following output is produced:

?- list_ast([2,+,4,*,3], X).
X = 2+4*3 .

How can I change the grammar so that it works in post-order and accepts expressions such as ?- list_ast([2,4,3,*,+], X).

This is probably easier than you expect. In postfix, you don't have to deal with operator precedence, and so you don't have to deal with parentheses, either. You need a stack that pops its top elements when you encounter an operator. You should at least attempt a solution on your own and add it to the question if you can't seem to make it work. See this: http://stackoverflow.com/questions/15946926/prolog-recursive-return-values-are-dissapearing/15947810#15947810 for a solution to a similar problem. — , Oct 24 '14 at 08:22
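The stack idea from this comment can be sketched as a DCG that builds the same kind of AST as the grammar in the question. This is only an illustration, assuming SWI-Prolog: the names postfix_ast/2 and postfix//2 are hypothetical and do not appear anywhere in the thread, and only the five binary operators from the question's grammar are accepted.

```prolog
:- use_module(library(lists)).

% Hypothetical entry point: succeeds when the token list is a complete
% postfix expression, leaving exactly one term on the stack.
postfix_ast(Ls, AST) :-
    phrase(postfix([], [AST]), Ls).

% postfix(Stack0, Stack): consume tokens while threading a stack of terms.
postfix(S, S) --> [].
% A number is pushed onto the stack.
postfix(S0, S) -->
    [N], { number(N) },
    postfix([N|S0], S).
% A binary operator pops the right operand first, then the left one,
% and pushes the combined term back.
postfix([R,L|S0], S) -->
    [Op], { memberchk(Op, [+,-,*,/,^]), E =.. [Op, L, R] },
    postfix([E|S0], S).
```

With this sketch, the query ?- postfix_ast([2,4,3,*,+], X). gives X = 2+4*3 as its first answer. Because the right operand is popped first, non-commutative operators keep their operand order, so [1,2,-] becomes 1-2.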

Daniel Lyons · Answer 1 · 2014-10-25T04:45:18.643

I probably should have let you work it out on your own but I remember struggling with these kinds of things so I figure it might be helpful to others.

Edit: please see Wouter's comments about my solution: it does not work for non-commutative operations like subtraction and division.

First, I want to translate from postfix to infix because that seems like more fun to me. Also then I can just ask Prolog to evaluate it, I don't have to build a stack explicitly to do the evaluation. I consider this one of the miracles of Prolog, that you can manipulate arithmetic expressions like this without them collapsing into values.

In fact, since we just want to parse from right to left I'm just going to flip it around to a Polish notation list and parse that using the sequence itself as the stack.

postfix_to_infix(Postfix, Infix) :-
    reverse(Postfix, Prefix),
    prefix_to_infix(Prefix, Infix).

Converting from prefix to infix is not that bad. The trick is threading the unconsumed remainder of the list through the calls, so we'll need another argument for that. Note that this kind of threading is exactly what DCGs do, so whenever I notice myself doing a lot of this I think "gee, I could probably do this easier with DCGs." On the other hand, of the two helper predicates only one does this threading explicitly, so it might not help that much. Exercise for the reader, I suppose.

We're using univ =.. to build a Prolog term on the way out. Evaluation comes later.

prefix_to_infix(Seq, Expr) :-
    prefix_to_infix(Seq, Expr, []).

% base case: peel off a number
prefix_to_infix([Val|Xs], Val, Xs) :- number(Val).

% inductive case
prefix_to_infix([Op|Rest], Expr, Remainder) :-
    atom(Op),
    % threading Rest -> Rem1 -> Remainder
    prefix_to_infix(Rest, Left, Rem1),
    prefix_to_infix(Rem1, Right, Remainder),
    Expr =.. [Op, Left, Right].

Let's see it in action a few times along with evaluation:

?- postfix_to_infix([2,4,3,7,*,+,*],Term), Res is Term.
Term = (7*3+4)*2,
Res = 50 ;
false.

Let's permute the list by moving the operators and literals around in meaning-preserving ways, just to make sure that the parse isn't doing anything completely stupid.

?- postfix_to_infix([2,3,7,*,4,+,*],Term), Res is Term.
Term = (4+7*3)*2,
Res = 50 ;
false.

?- postfix_to_infix([3,7,*,4,+,2,*],Term), Res is Term.
Term = 2* (4+7*3),
Res = 50 ;
false.

Now let's make sure it properly fails when we have too much or too little.

?- postfix_to_infix([3,7,*,4,+,2,*,3],Term), Res is Term.
false.

?- postfix_to_infix([3,7,*,4,+,2,*,+],Term), Res is Term.
false.

?- postfix_to_infix([7,*], Term), Res is Term.
false.

Looks like it works to me.

Hope this helps!
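For the DCG exercise mentioned earlier, one possible shape is sketched below. It is not part of the original answer: prefix_expr//1 and prefix_to_infix_dcg/2 are hypothetical names, and the sketch takes a genuine prefix (Polish notation) list as input, so it does not include the reverse/2 step.

```prolog
% Hypothetical DCG variant: the grammar threads the remaining tokens
% implicitly, replacing the explicit third argument.
prefix_expr(Val) -->
    [Val], { number(Val) }.
prefix_expr(Expr) -->
    [Op], { atom(Op) },
    prefix_expr(Left),
    prefix_expr(Right),
    { Expr =.. [Op, Left, Right] }.

% Succeeds only if the whole list is consumed by a single expression.
prefix_to_infix_dcg(Seq, Expr) :-
    phrase(prefix_expr(Expr), Seq).
```

For example, ?- prefix_to_infix_dcg([*,2,+,4,3], E). binds E to 2*(4+3).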

Wouter Beek · Answer 2 · 2014-10-25T06:54:17.820

My implementation differs from Daniel's in the following ways:

Works for non-commutative operators such as - and //. I believe that postfix notation is not quite the reverse of Polish notation. E.g., - 1 2 in Polish notation corresponds to 1 2 - in postfix notation, not 2 1 -.
Works for unary operators. Operators with arity >2 do not occur in Prolog unfortunately, but the implementation would be able to handle those as well.
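The operand-order point can be illustrated with a small variant of the reverse-and-parse approach. This is a sketch under assumptions, not code from either answer: postfix_to_infix_fixed/2 and prefix_to_infix_swapped/3 are hypothetical names. After reversing a postfix list, the right operand of each operator appears before the left one, so the two sub-results are swapped when the term is built.

```prolog
:- use_module(library(lists)).

% Hypothetical corrected conversion from postfix to an infix term.
postfix_to_infix_fixed(Postfix, Infix) :-
    reverse(Postfix, Reversed),
    prefix_to_infix_swapped(Reversed, Infix, []).

% Base case: peel off a number.
prefix_to_infix_swapped([Val|Xs], Val, Xs) :-
    number(Val).
% In the reversed list the right operand is parsed first.
prefix_to_infix_swapped([Op|Rest], Expr, Remainder) :-
    atom(Op),
    prefix_to_infix_swapped(Rest, Right, Rem1),
    prefix_to_infix_swapped(Rem1, Left, Remainder),
    Expr =.. [Op, Left, Right].
```

With the swap in place, ?- postfix_to_infix_fixed([1,2,-], T). binds T to 1-2 rather than 2-1.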

Code

RPN stands for Reverse Polish Notation, also known as Postfix Notation.

%! rpn(+Notation:list(atomic), -Outcome:number) is nondet.

rpn(Notation, Outcome):-
  rpn(Notation, [], Outcome).

rpn([], [Outcome], Outcome):-
  number(Outcome).
% Push operands onto the stack.
rpn([Operand|Notation], Stack, Outcome):-
  number(Operand), !,
  rpn(Notation, [Operand|Stack], Outcome).
% Evaluate n-ary operators w.r.t. the top n operands on the stack.
rpn([Op|Notation], Stack, Outcome):-
  % Notice that there can be multiple operators with the same name.
  current_op(_, OpType, Op),
  op_type_arity(OpType, OpArity),

  % Select the appropriate operands.
  length(OperandsRev, OpArity),
  append(OperandsRev, NewStack, Stack),

  % Apply the operator to its operands.
  reverse(OperandsRev, Operands),
  Expression =.. [Op|Operands],
  Result is Expression,

  rpn(Notation, [Result|NewStack], Outcome).

op_type_arity(fx,  1).
op_type_arity(fy,  1).
op_type_arity(xf,  1).
op_type_arity(xfx, 2).
op_type_arity(xfy, 2).
op_type_arity(yf,  1).
op_type_arity(yfx, 2).

Example of use

?- rpn([5,1,2,+,4,*,+,3,-], X).
X = 14.

Afterthoughts

I particularly liked Daniel's use of is/2 to evaluate the outcome, so that his main task was conversion to infix notation. My implementation also relies on the current operator declarations (i.e., op/3), but it looks them up through current_op/3 instead of leaving everything to is/2.

Since Prolog defines multiple operators with the same name, my approach may give ambiguous results:

?- rpn([1,2,+,-], X).
X = -1 ;
X = -3 ;
false.

Another example: the following query fails with Daniel's algorithm but has two solutions with mine:

?- rpn([3,7,*,4,+,2,*,+], X).
X = 29 ;
X = 50 ;
false.

This ambiguity is probably not allowed in 'official' postfix notation (although I like it). It is easily restricted by taking only the largest arity that occurs for a given operator name.
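The restriction to the largest arity could be sketched as follows, assuming SWI-Prolog and its library(aggregate). The names op_max_arity/2 and rpn_strict/2,3 are hypothetical, and op_type_arity/2 is repeated only so that the sketch loads on its own.

```prolog
:- use_module(library(aggregate)).
:- use_module(library(lists)).

op_type_arity(fx,  1).
op_type_arity(fy,  1).
op_type_arity(xf,  1).
op_type_arity(xfx, 2).
op_type_arity(xfy, 2).
op_type_arity(yf,  1).
op_type_arity(yfx, 2).

% Largest arity declared for an operator name; fails if Op is not an operator.
op_max_arity(Op, Arity) :-
    atom(Op),
    aggregate_all(max(A),
                  ( current_op(_, Type, Op), op_type_arity(Type, A) ),
                  Arity).

rpn_strict(Notation, Outcome) :-
    rpn_strict(Notation, [], Outcome).

rpn_strict([], [Outcome], Outcome) :-
    number(Outcome).
% Push operands onto the stack.
rpn_strict([Operand|Notation], Stack, Outcome) :-
    number(Operand), !,
    rpn_strict(Notation, [Operand|Stack], Outcome).
% Apply the operator using only its largest declared arity.
rpn_strict([Op|Notation], Stack, Outcome) :-
    op_max_arity(Op, Arity),
    length(OperandsRev, Arity),
    append(OperandsRev, NewStack, Stack),
    reverse(OperandsRev, Operands),
    Expression =.. [Op|Operands],
    Result is Expression,
    rpn_strict(Notation, [Result|NewStack], Outcome).
```

Under this restriction + and - are always treated as binary, so the ambiguous query [1,2,+,-] fails instead of producing two answers, while [5,1,2,+,4,*,+,3,-] still evaluates to 14. The trade-off is that unary uses of those operators are no longer accepted.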

Thanks for the catch and wow! I completely forgot about `current_op/3`, this is extremely cool. — Daniel Lyons, Oct 25 '14 at 04:44
