4

I have a simple google sheets REGEXEXTRACT function that extracts numbers from NFL lines and over unders.

Here's the problem: The function

=VALUE(REGEXEXTRACT(I3,"-*\d*.?\d+"))

properly extracts -13.5 from text Line: GB -13.5 But when I use the same function on the text O/U: 51.5, it incorrectly extracts 51.0

Where is my regular expression failing me?

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • If you are willing to change your formula you could probably also use [this very short version](https://stackoverflow.com/a/65782427/1527780). – marikamitsos Jan 18 '21 at 21:44

4 Answers4

6

The problem with your current regex is that, as written, in the second case the leading portion of the pattern is not matching anything, and only the final \d+ matches the integer 51 (see the demo here). You see 51.0 in your Excel spreadsheet because the decimal component defaults to being zero. Please use this regex pattern instead:

-?\d+(?:\.\d+)?

Demo

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
3

I'll throw another approach into the ring:

=1*REGEXEXTRACT(I3,"[\d.-]+")

[\d.-]+ reads "Any digit or period or hyphen in any unbroken combination of one or more such characters."

The 1* is another method of turning a number-like string into a true number.

Erik Tyler
  • 9,079
  • 2
  • 7
  • 11
  • My preferred solution. For added compatibility use [\d.,-]+ to include , separated thousands. wrap in VALUE() to convert properly to a number – babelmonk Dec 30 '22 at 07:37
1

What about:

=--REGEXEXTRACT(I3,"-?\d+(?:\.\d+)?")

The pattern means:

  • -? - Optional hypen.
  • \d+ - 1+ digits.
  • (?: - Open non-capture group.
    • \.\d+ - A literal dot and 1+ digitis.
    • )? - Close non-capture group and make it optional.

Instead of VALUE() I used the double negative to make use of the ability of GS to turn a string that looks like a number into a number through direct calculations. The first hyphen creates a negative number which we counter with the 2nd. The same result can be achieved through 1*.

JvdV
  • 70,606
  • 8
  • 39
  • 70
  • This worked!! Can you explain the -- in the formula ? – ryanbuckner Jan 18 '21 at 15:49
  • 1
    I'm learning a lot from this. Did you mean to put a question mark on the end of your pattern to make the decimal part optional though? It looks like it doesn't work on integers as shown, should they ever occur. – Tom Sharpe Jan 18 '21 at 16:36
  • I edited the pattern as indeed it needed the question mark. Strange how I had it all ready included in my explanation underneath it. @TomSharpe. And the whole point was to have the optional decimal point in case of integers as you suggest. Thank for the extra eyes here. – JvdV Jan 18 '21 at 16:40
0

I believe the shortest one would be

=REGEXEXTRACT(I2," .*")+0

enter image description here

Which means:
return everything after the space and turn it to a number

marikamitsos
  • 10,264
  • 20
  • 26