
I'm trying to build my own evaluator for mathematical expressions in Ruby. Before doing that, I'm implementing a parser that breaks the expression into a tree (of arrays). It correctly breaks down expressions with parentheses, but I'm having a lot of trouble figuring out how to make it respect operator precedence, so that multiplication binds tighter than addition.

Right now, a string like 1+2*3+4 becomes 1+[2*[3+4]] instead of 1+[2*3]+4. I'm looking for the simplest possible solution.

Here is my code:

@d = 0
@error = false
#manipulate an array by reference
def calc_expr expr, array
    until @d == expr.length
        c = expr[@d]
        case c 
        when "("
            @d += 1
            array.push calc_expr(expr, Array.new)
        when ")"
            @d += 1
            return array
        when /[\*\/]/
            @d +=1
            array.push c
        when /[\+\-]/
            @d+=1
            array.push c
        when /[0-9]/
            # Consume the whole run of digits starting at @d
            digits = expr[@d..-1][/\A[0-9]+/]
            array.push digits.to_i
            @d += digits.length
        else 
            unless @error
                @error = true
                puts "Problem evaluating expression at index:#{@d}"
                puts "Char '#{expr[@d]}' not recognized"
            end
            return
        end
    end

    return array
end
@expression = ("(34+45)+(34+67)").gsub(" ","")
evaluated = calc_expr @expression, []
puts evaluated.inspect
theideasmith
  • I highly recommend [Treetop](http://cjheath.github.io/treetop/) for setting up a parsing expression grammar to represent the math grammar you wish to support. – Phrogz Jan 22 '15 at 16:19
  • Additionally, see [Parsing Complete Mathematical Expressions with PEG.js](http://stackoverflow.com/questions/19390084/parsing-complete-mathematical-expressions-with-peg-js) – Phrogz Jan 22 '15 at 16:23
  • There is even a PEG representation of the JavaScript language, which includes all the math portions with proper associativity: https://github.com/pegjs/pegjs/blob/master/examples/javascript.pegjs#L735 – Phrogz Jan 22 '15 at 16:47
  • I got a simple implementation working at: https://gist.github.com/aclinnovator/1437056fcfbe2dc7e3c4 – theideasmith Jan 23 '15 at 02:50

2 Answers


For fun, here's a regex-based 'parser' that uses the nice "inside-out" approach suggested by @DavidLjungMadison. It performs simple "a*b" multiplication and division first, followed by "a+b" addition and subtraction, then unwraps any number left in parentheses, "(a)", and then starts over.

To keep the regex simple I've chosen to support only integers; expanding each -?\d+ to something more robust and replacing .to_i with .to_f would allow it to work with floating point values.

module Math
  def self.eval( expr )
    expr = expr.dup
    go = true
    while go
      go = false
      go = true while expr.sub!(/(-?\d+)\s*([*\/])\s*(-?\d+)/) do
        m,op,n = $1.to_i, $2, $3.to_i
        op=="*" ? m*n : m/n
      end
      go = true while expr.sub!(/(-?\d+)\s*([+-])\s*(-?\d+)/) do
        a,op,b = $1.to_i, $2, $3.to_i
        op=="+" ? a+b : a-b
      end
      go = true while expr.gsub!(/\(\s*(-?\d+)\s*\)/,'\1')
    end
    expr.to_i
  end
end

And here's a bit of testing for it:

tests = {
  "1"                            => 1,
  "1+1"                          => 2,
  "1 + 1"                        => 2,
  "1 - 1"                        => 0,
  "-1"                           => -1,
  "1 + -1"                       => 0,
  "1 - -1"                       => 2,
  "2*3+1"                        => 7,
  "1+2*3"                        => 7,
  "(1+2)*3"                      => 9,
  "(2+(3-4) *3 ) * -6 * ( 3--4)" => 42,
  "4*6/3*2"                      => 16
}

tests.each do |expr,expected|
  actual = Math.eval expr
  puts [expr.inspect,'=>',actual,'instead of',expected].join(' ') unless actual == expected
end

Note that I use sub! instead of gsub! on the operators in order to survive the last test case. If I had used gsub!, then "4*6/3*2" would first be turned into "24/6" and thus result in 4, instead of the correct expansion "24/3*2" → "8*2" → 16.
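
To see that difference concretely, here is a small sketch that applies the same multiplication/division pattern once with sub and once with gsub. The constant MUL_DIV and the helper mul_div are names assumed for this illustration; they are not part of the code above.

```ruby
# Same multiplication/division pattern as used in Math.eval.
MUL_DIV = /(-?\d+)\s*([*\/])\s*(-?\d+)/

# Hypothetical helper: evaluates one matched "a*b" or "a/b" pair.
def mul_div(a, op, b)
  op == "*" ? a * b : a / b
end

# sub rewrites only the leftmost match, so the chain stays left-associative.
step_by_step = "4*6/3*2".sub(MUL_DIV) { mul_div($1.to_i, $2, $3.to_i).to_s }

# gsub rewrites every non-overlapping match in one pass: "4*6" and "3*2".
all_at_once = "4*6/3*2".gsub(MUL_DIV) { mul_div($1.to_i, $2, $3.to_i).to_s }

puts step_by_step  # 24/3*2
puts all_at_once   # 24/6
```

Because gsub pairs up "3*2" before the division has been applied, the left-to-right order of the operators is lost.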

Phrogz

If you really need to do the expression parsing yourself, then you should search for both sides of an expression (such as '2*3') and replace that with either your answer (if you are trying to calculate the answer) or an expression object (such as your tree of arrays, if you want to keep the structure of the expressions and evaluate later). If you do this in the order of precedence, then precedence will be preserved.

As a simplified example, your expression parser should:

  • Repeatedly search for the innermost parens with /\(([^()]+)\)/ and replace each match with the result of calling the expression parser on $1 (sorry about the ugly regexp :)

    Now all parens are gone, so you are looking at math operations between numbers and/or expression objects - treat them the same

  • Search for multiplication: /(expr|number)\*(expr|number)/. Replace this with either the answer or a new expression that encapsulates the two operands, depending on whether you need the answer now or need the expression tree.

  • Search for addition: ... etc ...
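
For the "calculate the answer now" case, the steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the method name evaluate and the three pattern constants are invented here, only integers are supported, there is no error handling, and a sign directly in front of a parenthesis (such as "-(1+2)") is not handled.

```ruby
# All names below are hypothetical. A "-" in front of the left operand is
# treated as a sign only when it does not directly follow a digit, so that
# "2-3*4" is read as 2 - (3*4) rather than 2 followed by (-3*4).
INNER_PARENS = /\(([^()]+)\)/
MUL_DIV_OP   = /((?<!\d)-?\d+)([*\/])(-?\d+)/
ADD_SUB_OP   = /((?<!\d)-?\d+)([+-])(-?\d+)/

def evaluate(expr)
  expr = expr.gsub(/\s+/, "") # work on a whitespace-free copy

  # Step 1: replace each innermost (...) with its value until no parens remain.
  # sub! returns nil once nothing matches, which ends each loop.
  nil while expr.sub!(INNER_PARENS) { evaluate($1).to_s }

  # Step 2: multiplication and division, leftmost pair first.
  nil while expr.sub!(MUL_DIV_OP) do
    a, op, b = $1.to_i, $2, $3.to_i
    (op == "*" ? a * b : a / b).to_s
  end

  # Step 3: addition and subtraction, leftmost pair first.
  nil while expr.sub!(ADD_SUB_OP) do
    a, op, b = $1.to_i, $2, $3.to_i
    (op == "+" ? a + b : a - b).to_s
  end

  expr.to_i
end

puts evaluate("1+2*(3+4)") # 15
puts evaluate("4*6/3*2")   # 16
```

Resolving the parentheses before looking at any operator means the flat string handled in steps 2 and 3 never contains a parenthesis, so an addition cannot be applied ahead of a neighbouring multiplication whose other operand is still wrapped in parens.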

If you are calculating the answer now, this is easy: each call to the expression parser eventually (after the necessary recursion) returns a number, which you can substitute for the original expression. Building the expression tree is a different story, because you have to decide how to handle a mixture of strings and expression objects so that you can still run a regexp over it. You could encode a pointer to the expression object in the string, or else replace the entire string up front with an array of objects and use something similar to a regexp to search the array.
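
As a rough sketch of the "array of objects" variant, the following assumes the input has already been split into a token array of the kind the question's calc_expr produces: integers, operator strings, and nested arrays for parenthesised groups. The names group, build_tree and eval_tree are made up for this example, and unary operators are not handled.

```ruby
# Hypothetical helper: scans left to right and folds every `a op b` triple
# whose operator is listed in `ops` into a nested [a, op, b] node.
def group(tokens, ops)
  out = []
  i = 0
  while i < tokens.length
    tok = tokens[i]
    if ops.include?(tok) && !out.empty? && i + 1 < tokens.length
      out.push([out.pop, tok, tokens[i + 1]])
      i += 2
    else
      out.push(tok)
      i += 1
    end
  end
  out
end

# Builds a binary tree of [lhs, op, rhs] arrays in order of precedence.
def build_tree(tokens)
  # Nested arrays are parenthesised groups, so they are resolved first.
  tokens = tokens.map { |t| t.is_a?(Array) ? build_tree(t) : t }
  tokens = group(tokens, %w[* /])
  tokens = group(tokens, %w[+ -])
  tokens.length == 1 ? tokens.first : tokens
end

# Evaluates a tree produced by build_tree.
def eval_tree(node)
  return node unless node.is_a?(Array)
  lhs, op, rhs = node
  eval_tree(lhs).public_send(op, eval_tree(rhs))
end

tree = build_tree([1, "+", 2, "*", 3, "+", 4])
puts tree.inspect   # [[1, "+", [2, "*", 3]], "+", 4]
puts eval_tree(tree) # 11
```

Running the higher-precedence pass first is what keeps 2*3 together, and folding from the left makes chains such as 4*6/3*2 left-associative.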

You should also consider how to deal with unary operators, as in "3*+3". It might simplify things if the very first step you take is to convert every number into a simple expression object containing just that number. You might be able to handle unary operators at that stage, although that can involve tricky situations like "-3++1".
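
One possible way to handle signs in that first step is sketched below: a tokenizer that folds a run of + and - characters into the following number whenever an operand is expected (at the start, after an operator, or after an opening paren). The method name tokenize is an assumption for this example, only integers are supported, and a sign in front of a parenthesis is simply left as an operator token.

```ruby
require "strscan"

# Hypothetical tokenizer: returns integers and one-character operator/paren
# strings, with unary signs already folded into the numbers.
def tokenize(expr)
  scanner = StringScanner.new(expr.delete(" "))
  tokens = []
  until scanner.eos?
    prev = tokens.last
    # An operand is expected at the start, after an operator, or after "(".
    operand_expected = prev.nil? || %w[+ - * / (].include?(prev)
    if operand_expected && scanner.scan(/([+-]*)(\d+)/)
      value = scanner[2].to_i
      # An odd number of minus signs makes the number negative.
      tokens << (scanner[1].count("-").odd? ? -value : value)
    elsif scanner.scan(/\d+/)
      tokens << scanner.matched.to_i
    elsif scanner.scan(/[-+*\/()]/)
      tokens << scanner.matched
    else
      raise ArgumentError, "unexpected character at index #{scanner.pos}"
    end
  end
  tokens
end

puts tokenize("3*+3").inspect  # [3, "*", 3]
puts tokenize("-3++1").inspect # [-3, "+", 1]
```

Deciding between "sign" and "operator" from the previous token keeps the later precedence passes free of unary cases, at the cost of not covering inputs like "-(1+2)".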

Or just find an expression parsing library as suggested. :)