3

First of all, there are lots and lots of questions about regex on math expressions; if I overlooked one that already answers this, sorry. While this stems from parsing a math expression (no eval...), it is more about "can't this be done with a single tricky regex or similar?".

Is there a way to split a string containing a math expression like "-5.42+2--3*-5.5/-2+-4" into ["-5.42", "+", "2", "-", "-3", "*", "-5.5", "/", "-2", "+", "-4"] with a single .split[1]? In other words, split it into binary operators (/[+*/]|(?<!^|[+\-*/])-/, where the lookbehind is the issue) and their arguments (/-?\d+(\.\d+)?/). A unary minus appears at most once per number, i.e. there is no ---. There are no parentheses at this step.
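
For reference, in an engine that does support lookbehind (it was standardized in ES2018), the single split described above can be sketched as follows. The function name `splitWithLookbehind` is made up for illustration, and the capture group is only there so that `split` keeps the operators in the result.

```javascript
// Assumption: the engine supports ES2018 lookbehind assertions.
// A binary operator is + * / anywhere, or a "-" that is neither at the
// start of the string nor directly after another operator.
function splitWithLookbehind(expression) {
  return expression.split(/([+*\/]|(?<!^|[+\-*\/])-)/);
}

console.log(splitWithLookbehind("-5.42+2--3*-5.5/-2+-4"));
// ["-5.42", "+", "2", "-", "-3", "*", "-5.5", "/", "-2", "+", "-4"]
```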

The way I see it, without lookbehind it is impossible to differentiate the unary - from the binary - within the constraints of split (this is the answer I expect). However, maybe there is a trick to get the same result without lookbehind. I have been surprised by tricky regex workarounds too often to trust my intuition.

With several operations, here is one of many ways to do it (note that the first regex replace somewhat emulates a lookbehind):

console.log("-5.42+2--3*-5.5/-2+-4"
  .replace(/^-|([+\-*/])-/g, "$1#")
  .split(/([+\-*/])/)
  .map(e => e.replace("#", "-"))
);

Another alternative would be to reverse the string, then use lookahead instead and reverse the results again.
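
A minimal sketch of that reversal idea, assuming the same input format as above (the helper names `reverse` and `splitViaReverse` are made up for illustration). In the reversed string a unary minus is a "-" that is followed by an operator or by the end of the string, so a negative lookahead can exclude it.

```javascript
// Reverse a string character by character.
const reverse = (s) => [...s].reverse().join("");

function splitViaReverse(expression) {
  return reverse(expression)
    // Binary operator in the reversed string: + * / anywhere, or a "-"
    // that is NOT followed by the end of input or by another operator.
    .split(/([+*\/]|-(?!$|[+\-*\/]))/)
    .map(reverse) // restore each token
    .reverse();   // restore the token order
}

console.log(splitViaReverse("-5.42+2--3*-5.5/-2+-4"));
// ["-5.42", "+", "2", "-", "-3", "*", "-5.5", "/", "-2", "+", "-4"]
```

This is still more than one operation, so it does not meet the single-split goal; it only trades the lookbehind for a lookahead.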

[1] I would add "(or operation)" here, but the question of what an "operation" is would immediately arise and completely derail the topic. The same goes for "(or in a beautiful way)", which would be opinion-based. However, I thought e.g. about using repeated parenthesized matches, which are not possible in JavaScript but would be very similar to split.

ASDFGerte

2 Answers

1

Short answer:

You can't do what you want with regex alone. You have to use a parsing expression grammar (PEG).

Long answer:

At first glance, a math expression just contains terms and operators. In fact, it is more complex than that, because some operators bind more tightly than others. So what we have to deal with is not only terms and operators, but a hierarchy like this:

  1. First we declare a term = a number or an expression
  2. Then we have a unary = the positive or negative value of a term
  3. Then we have a multiple = a unary [multiplied or divided by another unary]{0 times or more}
  4. Then we have a sum = a multiple [plus or minus another multiple]{0 times or more}
  5. Finally, an expression = a sum

This is the declaration for an expression that uses only + - * /. A more complex expression that contains parentheses, an "and" operator, an "or" operator, etc. will need a more complex declaration. And you have no choice but to use PEG.
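
To make the five rules concrete, here is a minimal sketch that follows them one function per rule. It is a hand-written recursive-descent evaluator in plain JavaScript, not PEGjs and not a generated parser. The function name `evaluate` is made up for illustration, and two things are assumptions on top of the rules: "an expression" inside a term is taken to mean a parenthesized expression, and whitespace is not handled.

```javascript
// Hypothetical helper: evaluates a + - * / expression without whitespace.
function evaluate(source) {
  let pos = 0; // index of the next unread character

  // Rule 1: term = a number or a (parenthesized) expression
  function term() {
    if (source[pos] === "(") {
      pos++; // consume "("
      const value = expression();
      if (source[pos] !== ")") throw new SyntaxError(`Expected ")" at ${pos}`);
      pos++; // consume ")"
      return value;
    }
    const match = /^\d+(\.\d+)?/.exec(source.slice(pos));
    if (!match) throw new SyntaxError(`Expected a number at ${pos}`);
    pos += match[0].length;
    return parseFloat(match[0]);
  }

  // Rule 2: unary = positive or negative value of a term
  function unary() {
    if (source[pos] === "-") { pos++; return -term(); }
    if (source[pos] === "+") { pos++; return term(); }
    return term();
  }

  // Rule 3: multiple = unary, then zero or more "* unary" or "/ unary"
  function multiple() {
    let value = unary();
    while (source[pos] === "*" || source[pos] === "/") {
      const op = source[pos++];
      const right = unary();
      value = op === "*" ? value * right : value / right;
    }
    return value;
  }

  // Rule 4: sum = multiple, then zero or more "+ multiple" or "- multiple"
  function sum() {
    let value = multiple();
    while (source[pos] === "+" || source[pos] === "-") {
      const op = source[pos++];
      const right = multiple();
      value = op === "+" ? value + right : value - right;
    }
    return value;
  }

  // Rule 5: expression = sum
  function expression() {
    return sum();
  }

  const result = expression();
  if (pos !== source.length) {
    throw new SyntaxError(`Unexpected "${source[pos]}" at ${pos}`);
  }
  return result;
}

console.log(evaluate("2*(3+4)")); // 14
```

Because a "-" is only ever read by `unary` where an operand is expected and by `sum` where an operator is expected, the position in the grammar decides whether it is unary or binary; no lookbehind is involved.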

So, try PEGjs. It has a demo of how simple expressions like "2 * (3 + 4)" are implemented.

And of course you can extract anything you want from an expression.

trgiangvp3
  • This was interesting to read about, and I found another solution to the source of this question, which was using PEGjs. I was not concerned with prioritizing multiplication over addition or the like at this step; I just wanted to differentiate the unary `-` from the binary one in a single tokenizing step. Looking at it now, combining numbers and the unary `-` in the same step is probably not even a good idea. You focus more on how a complete expression can be parsed using regex vs PEG. It's useful information, but I am unsure whether I should mark it as the answer. – ASDFGerte Aug 16 '17 at 23:36
  • Hello ASDFGerte, thanks for your comment. I'm not sure what your purpose is in splitting the expression. But from your question, I see no way you can do it without expression grammar analysis, because your expressions will get more and more complex. My example is just an example. Depending on your purpose, your grammar declaration will vary. But you can't extract anything if you don't clearly understand your expression grammar, even if you use `split` and `regex`. – trgiangvp3 Aug 17 '17 at 00:17
-3

What is wrong with splitting on the operators?

.split(/(?=[-+*/()])/);
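
As a quick check of what this split yields on the question's input, here is a small sketch (the function name `splitBeforeOperators` is made up for illustration):

```javascript
// Splits at every position that is directly followed by an operator
// or parenthesis; the zero-width lookahead consumes no characters.
function splitBeforeOperators(expression) {
  return expression.split(/(?=[-+*\/()])/);
}

console.log(splitBeforeOperators("-5.42+2--3*-5.5/-2+-4"));
// ["-5.42", "+2", "-", "-3", "*", "-5.5", "/", "-2", "+", "-4"]
console.log(splitBeforeOperators("1-2"));
// ["1", "-2"]
```

A binary operator comes out as its own token only when a unary minus follows it. Otherwise it stays attached to its right-hand operand, as in "+2", and "1-2" cannot be told apart from a 1 followed by a negative 2.
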
NetMage