The double backslash in \\d
suggests an old version with standard_conforming_strings = off
. The manual:
Beginning in PostgreSQL 9.1, the default is on
(prior releases defaulted to off).
In modern versions with standard_conforming_strings = on
, this string makes little sense as regular expression: '^\\d+$'
. To detect strings consisting of one or more digits use either E'^\\d+$'
(prefixed with E
) or '^\d+$'
. Details:
Integer literals also allow an optional leading sign for negative / positive numbers, and leading / dangling white space in Postgres.
So, this is the complete regular expression for valid integer
literals:
CASE WHEN d.value ~ '^\s*[-+]?\d+\s*$' THEN d.value::int ELSE 0 END
The regular expression explained:
^
.. start of string
\s
.. class shorthand for [[:space:]]
(white space)
*
.. quantifier for 0 or more times
[+-]
.. character class consisting of +
and -
?
.. quantifier for 0 or 1 times
\d
.. class shorthand for [[:digit:]]
(digits)
+
.. quantifier for 1 or more times
\s*
.. same as above
$
.. end of string
Consider the syntax rules for numeric string literals. One essential quote:
There cannot be any spaces or other characters embedded in the constant
That's because a numeric constant is not quoted, hence white space is not possible. Not applicable for casting strings. White space is tolerated:
Leading, trailing and right after the exponent char.
So these are all legal strings for the cast to numeric
:
'^\s*[-+]?\d*\.?\d+(?:[eE]\s*[-+]?\d+)?\s*$'
The only new element are parentheses (()
) to denote the contained regular expression as atom. Since we are not interested in back references, use "non-capturing": (?:
...)
and append a question mark (?:[eE]\s*[-+]?\d+)?
to mean: the "exponential" part can be added or not, as a whole.
Assuming dot (.
) as decimal separator. You might use comma instead (,
) or [,\.]
to allow either. But only dot is legal for the cast.
Test:
SELECT '|' || lit || '|' AS text_with_delim
, lit ~ '^\s*[-+]?\d*\.?\d+([eE]\s*[-+]?\d+)?\s*$' AS valid
, lit::numeric AS number
FROM unnest ('{1
, 123
, 000
, " -1 "
, +2
, 1.2
, .34
, 5e6
, " .5e -6 "
}'::text[]) lit;
Result:
text_with_delim |
valid |
number |
|1| |
t |
1 |
|123| |
t |
123 |
|000| |
t |
0 |
| -1 | |
t |
-1 |
|+2| |
t |
2 |
|1.2| |
t |
1.2 |
|.34| |
t |
0.34 |
|5e6| |
t |
5000000 |
| .5e -6 | |
t |
0.0000005 |
Or you might have use for to_number()
to convert strings of arbitrary given format.