2

We have a table in Hana with temperature data. Years ago the genius decision was made in our old database to make the temperature field String since temps were added manually and they could use it to add exception codes and text when a temp was bad or couldn't be taken.

Now I'm trying to extract only the rows with valid temps (some form of decimal or integer) so I can cast the temps as decimal and do analysis. Using regex, I can filter out all non-numeric fields...except those like this:

52.3.

I'm currently using this expression:

/^[+-]?((\d+(\.\d*)?)|(\.\d+))$/

There are a lot of weird decimals formats this does catch, but not numbers with an additional, separated period at the end.

They're not going to fix the data, even if they did it would take them forever to get around to it. So I need a new expression to handle these. Hoping someone has an idea because my google-fu has failed me so far.

Sandra Rossi
  • 11,934
  • 5
  • 22
  • 48
  • Wanted to add that the method we're currently using in the sql is to replace all numeric characters with '' and then test if the length of the resulting string is equal to 0. So, we'd expect the second . to not be removed, the length to be 1, and the record to be ignored. – John Victor Aug 02 '22 at 16:22

4 Answers4

0

Here is a short and appropriate solution:

const getTemperature = str => /^[+-]?\d+\.\d+\.?$/.test(str) ? str.match(/^[+-]?\d+\.\d+\.?$/) : undefined;

console.log(getTemperature("+52.3")); // ["+52.3"]
console.log(getTemperature("-52.3.")); // ["-52.3."]
console.log(getTemperature("52.3")); // ["52.3"]
console.log(getTemperature("52.3.")); // ["52.3."]

Good luck !

Mrwaks
  • 160
  • 5
0

You could use this:

^[+-]?\d+(?:\.\d+)?$
  • \d+ to match 1 or more digits
  • (?:\.\d+)? Non-capturing group with a decimal separator followed by 1 or more digits. So it can either exist or not, ensures only one decimal separator

Regex 101 link

So it matches:

52.3
42
-6
52.0
0.61
+0.6

but doesn't match:

52.
test
192.0.0.1
-6.0.1
+.8
++6
-6-
-3.
degant
  • 4,861
  • 1
  • 17
  • 29
0

That's a pretty sensible looking regex, it's a good start. Kudos for using ^ $ anchors at front and back.

What you have to understand is that in a regex . is quite different from \. -- the 1st matches any character while the 2nd matches just a literal . dot character.

So you'll want

/^[+-]?((\d+(.\d*)?)|(.\d+))$/

to become

/^[+-]?((\d+(\.\d*)?)|(\.\d+))$/
J_H
  • 17,926
  • 4
  • 24
  • 44
0

You could write the pattern as

^[+-]?\d*\.?\d+\.?$

Explanation

  • ^ Start of string
  • [+-]? Match an optional - or +
  • \d* Match optional digits
  • \.? Match an optional dot
  • \d+ Match 1+ digits
  • \.? Match an optional dot
  • $ End of sting

Regex demo

The fourth bird
  • 154,723
  • 16
  • 55
  • 70