1

I have a case where I am getting the data from DB and converting the string to a number using TO_NUMBER, but this case fails when the string is an empty string with unknown or space char like

columnA
------
  4444
      
333333

The string '4444' and '333333' is converted to number by there is and error "ora-01722 invalid number" for the 2nd string.

Can this be handled with DECODE or CAST in any way, because I need to use TO_NUMBER any how for further processing?

2 Answers2

2

I hope this could be Insight of your issue.

select 
    TO_NUMBER(trim(colA)),  
    TO_NUMBER(REGEXP_REPLACE(colA,'(^[[:space:]]*|[[:space:]]*$)')),
    regexp_instr(colA, '[0-9.]')    
from 
(
    select '   123' colA from dual
    union all
    select '   ' colA from dual
    union all
    select '.456' colA from dual
)

This is similar issue : Trim Whitespaces (New Line and Tab space) in a String in Oracle

jornathan
  • 646
  • 5
  • 13
0
  • If all the data within that column is composed of integers, integers with leading and/or trailing whitespaces, null values and only whitespaces then only using TRIM() function will suffice such as
SELECT TRIM(columnA)
  FROM t

and that would be more performant than using functions of regular expressions

But

  • If the data contains decimal numbers, letters, punctiations and special characters along with whitespaces and null values, then use
SELECT TRIM('.' FROM REGEXP_REPLACE(columnA,'[^[:digit:].]'))
  FROM t

where there is at most one dot character assumed to be between the starting and ending digits. All of the leading and trailing dots are trimmed at the end of the operation provided there is any of them. The other characters are already removed by the regular expression.

  • If you're sure that there's no trailing or leading dots, then using
SELECT REGEXP_REPLACE(columnA,'[^[:digit:].]')
  FROM t

would be enough

Demo

You can wrap up any of the expressions with TO_NUMBER() function depending on your case at the end

Barbaros Özhan
  • 59,113
  • 10
  • 31
  • 55
  • In `[^[:digit:].[:digit:]]` the second `[:digit:]` is redundant and can be removed as you have already got it once in the character list. – MT0 Nov 25 '21 at 09:14
  • "redundant" depends on the case. For the third case I've used only one `[:digit:]` POSIX but used twice for the second one. Since there might exist leading and/or trailing dots for the second case as it has no restriction for the character sets @MT0 Check out the demo please. – Barbaros Özhan Nov 25 '21 at 11:07
  • See [db<>fiddle](https://dbfiddle.uk/?rdbms=oracle_18&fiddle=e26d14e783c4eff705185bae8cb1f6fb) `[]` is the syntax for a character list and matches a **SINGLE** character; it does not match a sequence of characters. In this case you are matching a single character that is not either a digit or a dot or a digit and you don't need to check for digit twice. – MT0 Nov 25 '21 at 11:11
  • Thank you, good catch! @MT0 I presume you wanted to recommend `TRIM('.' FROM REGEXP_REPLACE(columnA,'[^[:digit:].]'))` for `case_b_simple` rather than the current one. – Barbaros Özhan Nov 25 '21 at 11:58
  • No, I'd be using a `NUMBER` column rather than getting into the situation where you need to parse badly formatted strings; if that isn't possible then I'd be using `REGEXP_SUBSTR(columnA, '\d+(\.\d*)?|\.\d+')` to find the first decimal value and ignore everything else (however the OP's question is not clear on their requirements if there were decimals). [db<>fiddle](https://dbfiddle.uk/?rdbms=oracle_18&fiddle=d97c22f68fd926dbd6efef27a4a26ff8) – MT0 Nov 25 '21 at 12:04