Building expression parser with Dart petitparser, getting stuck on node visitor

Question

I've got more of my expression parser working (Dart PetitParser, using ExpressionBuilder to build an AST data structure). It appears to generate accurate ASTs for floats, parens, power, multiply, divide, add, subtract, and unary negation in front of both numbers and expressions. (Each node is either a literal string or an object with a precedence and a List payload that gets walked and concatenated.)

I'm stuck now on visiting the nodes. I have clean access to the top node (thanks to Lukas), but I'm stuck on deciding whether or not to add a paren. For example, in 20+30*40, we don't need parens around 30*40, and the parse tree correctly has the node for this closer to the root so I'll hit it first during traversal. However, I don't seem to have enough data when looking at the 30*40 node to determine if it needs parens before going on to the 20+.. A very similar case would be (20+30)*40, which gets parsed correctly with 20+30 closer to the root, so once again, when visiting the 20+30 node I need to add parens before going on to *40.

This has to be a solved problem, but I never went to compiler school, so I know just enough about ASTs to be dangerous. What "a ha" am I missing?

// rip-common.dart:

import 'package:petitparser/petitparser.dart';
// import 'package:petitparser/debug.dart';

class Node {
  int precedence;
  List<dynamic> args;

  Node([this.precedence = 0, this.args = const []]) {
    // nodeList.add(this);
  }

  @override
  String toString() => 'Node($precedence $args)';

  String visit([int fromPrecedence = -1]) {
    print('=== visiting $this ===');
    var buf = StringBuffer();

    var parens = (precedence > 0) &&
        (fromPrecedence > 0) &&
        (precedence < fromPrecedence);
    print('<$fromPrecedence $precedence $parens>');

    // for debugging:
    var curlyOpen = '';
    var curlyClose = '';

    buf.write(parens ? '(' : curlyOpen);

    for (var arg in args) {
      if (arg is Node) {
        buf.write(arg.visit(precedence));
      } else if (arg is String) {
        buf.write(arg);
      } else {
        print('not Node or String: $arg');
        buf.write('$arg');
      }
    }

    buf.write(parens ? ')' : curlyClose);
    print('$buf for buf');
    return '$buf';
  }
}

class RIPParser {
  Parser _make_parser() {
    final builder = ExpressionBuilder();

    var number = char('-').optional() &
        digit().plus() &
        (char('.') & digit().plus()).optional();

    // precedence 5
    builder.group()
      ..primitive(number.flatten().map((a) => Node(0, [a])))
      ..wrapper(char('('), char(')'), (l, a, r) => Node(0, [a]));

    // negation is a prefix operator
    // precedence 4
    builder.group()..prefix(char('-').trim(), (op, a) => Node(4, [op, a]));

    // power is right-associative
    // precedence 3
    builder.group()..right(char('^').trim(), (a, op, b) => Node(3, [a, op, b]));

    // multiplication and addition are left-associative
    // precedence 2
    builder.group()
      ..left(char('*').trim(), (a, op, b) => Node(2, [a, op, b]))
      ..left(char('/').trim(), (a, op, b) => Node(2, [a, op, b]));
    // precedence 1
    builder.group()
      ..left(char('+').trim(), (a, op, b) => Node(1, [a, op, b]))
      ..left(char('-').trim(), (a, op, b) => Node(1, [a, op, b]));

    final parser = builder.build().end();

    return parser;
  }

  Result _result(String input) {
    var parser = _make_parser(); // eventually cache
    var result = parser.parse(input);

    return result;
  }

  String parse(String input) {
    var result = _result(input);
    if (result.isFailure) {
      return result.message;
    } else {
      print('result.value = ${result.value}');
      return '$result';
    }
  }

  String visit(String input) {
    var result = _result(input);
    if (result.isFailure) {
      // Nothing to visit on a failed parse; report the parser's message.
      return result.message;
    }
    var top_node = result.value;
    return top_node.visit();
  }
}

// rip_cmd_example.dart
import 'dart:io';

import 'package:rip_common/rip_common.dart';

void main() {
  print('start');
  while (true) {
    // readLineSync() returns null at end of input, so check for that too.
    final input = stdin.readLineSync();
    if (input == null || input.isEmpty) {
      break;
    }
    print(RIPParser().parse(input));
    print(RIPParser().visit(input));
  }
  print('done');
}

I actually solved it by carrying an annotation of the enclosing node, which informs whether parens are needed or not. — Randal Schwartz, Jan 11 '22 at 18:36

Lukas Renggli · Answer 1 · 2020-11-08T22:55:24.943

As you've observed, the ExpressionBuilder already assembles the tree in the right precedence order based on the operator groups you've specified.

The same holds for the wrapping parens node created by ..wrapper(char('('), char(')'), (l, a, r) => Node(0, [a])). If I test for that node with var parens = precedence == 0 && args.length == 1 && args[0] is Node; I get back the input string for your example expressions.

Unless I am missing something, there should be no reason for you to track the precedence manually. I would also recommend that you create different node classes for the different operators: ValueNode, ParensNode, NegNode, PowNode, MulNode, ... A bit verbose, but much easier to understand what is going on, if each of them can just visit (print, evaluate, optimize, ...) itself.
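To illustrate the separate-classes idea, here is a minimal sketch. The class names ValueNode, ParensNode and NegNode come from the suggestion above; BinaryNode (a single class standing in for PowNode, MulNode, and so on, to keep the example short) and the toSource method are assumptions made for illustration. The tree is built by hand here; in the parser, the wrapper callback would presumably return something like ParensNode(a) instead of Node(0, [a]).

```dart
// Hypothetical node classes: each one knows how to print itself.
abstract class Node {
  String toSource();
}

class ValueNode extends Node {
  ValueNode(this.text);
  final String text;
  @override
  String toSource() => text;
}

// Created for every explicit ( ... ) in the input, so the printer
// reproduces exactly the parens the user wrote.
class ParensNode extends Node {
  ParensNode(this.inner);
  final Node inner;
  @override
  String toSource() => '(${inner.toSource()})';
}

class NegNode extends Node {
  NegNode(this.operand);
  final Node operand;
  @override
  String toSource() => '-${operand.toSource()}';
}

// Assumption: one class for all binary operators, for brevity.
class BinaryNode extends Node {
  BinaryNode(this.left, this.op, this.right);
  final Node left;
  final String op;
  final Node right;
  @override
  String toSource() => '${left.toSource()}$op${right.toSource()}';
}

void main() {
  // (20+30)*40, built by hand the way the parser would nest it.
  final tree = BinaryNode(
    ParensNode(BinaryNode(ValueNode('20'), '+', ValueNode('30'))),
    '*',
    ValueNode('40'),
  );
  print(tree.toSource()); // (20+30)*40
}
```

With this shape no precedence bookkeeping is needed for round-tripping, because the parens the user typed are themselves nodes in the tree.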

I considered that as well, but it's hard to name things. :) The parse tree I get back is properly precedence-and-association-level aware. What I'm lost about is how to teach the visitor that "3-(4-5)" needs those parens, but "3+(4+5)" doesn't. — Randal Schwartz, Nov 09 '20 at 16:55
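One possible way to make that distinction, if the goal is to print only the parens that are strictly needed, is for each operator to tell its operands both its own precedence and whether a precedence tie on that side requires parens. The sketch below is an assumption-laden illustration, not the approach either poster used: Expr, Num, Bin, unparse and precedenceOf are hypothetical names, the tree is built by hand rather than by PetitParser, unary negation is left out, and + and * are treated as associative, which holds for real numbers but can change rounding with floats.

```dart
// Hypothetical types for a minimal-parens printer.
abstract class Expr {
  // parentPrec: precedence of the enclosing operator (0 at the root).
  // parensOnTie: whether equal precedence on this side needs parens.
  String render(int parentPrec, bool parensOnTie);
}

class Num extends Expr {
  Num(this.text);
  final String text;
  @override
  String render(int parentPrec, bool parensOnTie) => text;
}

// Same precedence levels as the operator groups in the question.
const precedenceOf = {'+': 1, '-': 1, '*': 2, '/': 2, '^': 3};

class Bin extends Expr {
  Bin(this.left, this.op, this.right);
  final Expr left;
  final String op;
  final Expr right;

  @override
  String render(int parentPrec, bool parensOnTie) {
    final prec = precedenceOf[op]!;
    // On a tie, the left operand of right-associative ^ needs parens,
    // and so does the right operand of - and /. The right operand of
    // + and * does not, because those operators are associative.
    final leftTie = op == '^';
    final rightTie = op == '-' || op == '/';
    final body =
        '${left.render(prec, leftTie)}$op${right.render(prec, rightTie)}';
    final parens = prec < parentPrec || (prec == parentPrec && parensOnTie);
    return parens ? '($body)' : body;
  }
}

String unparse(Expr e) => e.render(0, false);

void main() {
  final sub = Bin(Num('3'), '-', Bin(Num('4'), '-', Num('5')));
  final add = Bin(Num('3'), '+', Bin(Num('4'), '+', Num('5')));
  print(unparse(sub)); // 3-(4-5)
  print(unparse(add)); // 3+4+5
}
```

The child never inspects its parent directly; it only receives the two values the parent passes down, which is one way to read the "annotation of the enclosing node" idea.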
