0

My question is very similar to this one: removing all the rows from a table with columns A and B, where some records include non-numeric characters (looking like '1234#5' or '1bbbb'). However, the solutions I read around don't seem to work for me. For example,

SELECT count(*) FROM tbl  
--962060;


SELECT count(*)
FROM tbl
WHERE (REGEXP_like(A,'[^0-9]') OR REGEXP_like(B,'[^0-9]') ) ;
--17


SELECT count(*)
FROM tbl
WHERE (REGEXP_like(A,'[0-9]') and REGEXP_like(B,'[0-9]') )
;
--962060

From the 3rd query, I'd expect to see (962060-17)=962043. Why is it still 962060? An alternative query like this also gives the same answer:

SELECT count(*)
FROM tbl
WHERE (REGEXP_like(A,'[[:digit:]]')and REGEXP_like(B,'[[:digit:]]') )
;
--962060

Of course, I could bypass the problem by doing query1 minus query2, but I'd like to learn how to do that using regular expressions.

Community
  • 1
  • 1
Nonancourt
  • 559
  • 2
  • 10
  • 21

2 Answers2

4

If you use regexp you should take in account that any part of string may be matched as regexp. According your example you should specify that whole string should cntain only numbers ^ - is the beginig of string $ - is the end. And you may use \d- is digits

 SELECT count(*)
 FROM tbl
  WHERE (REGEXP_like(A,'^[0-9]+$') and REGEXP_like(B,'^[0-9]+$') )

or

 SELECT count(*)
 FROM tbl
  WHERE (REGEXP_like(A,'^\d+$') and REGEXP_like(B,'^\d+$') )
Michael Piankov
  • 1,989
  • 1
  • 8
  • 19
1

I know you specifically asked for a regex solution, but translate can solve these kind of questions as well (and usually faster because regexes use more processing power):

select count(1)
from tbl
where translate(a, 'x0123456789', 'x') is null
and translate(b, 'x0123456789', 'x') is null;

What this does: translate the characters 0123456789 to null, and if the result is null, then the input must have been all digits. The 'x' is just there because the third argument to translate can not be null.

Thought I should add this here, might be helpful to other readers.

Martin Schapendonk
  • 12,893
  • 3
  • 19
  • 24
  • Definitely faster! Thanks for adding this. – Nonancourt Oct 19 '16 at 13:20
  • However, from the [definition](https://www.techonthenet.com/oracle/functions/translate.php) of TRANSLATE I can't understand why `select TRANSLATE('5468127', 'x0123456789', 'x') from dual ;` should be NULL instead of 'xxxxxxx'. Also because `select TRANSLATE('x', 'x', 'x') from dual ;` is 'x' and not NULL. Could you please explain? – Nonancourt Oct 28 '16 at 10:46
  • @Nonancourt I expanded my answer a bit. – Martin Schapendonk Oct 28 '16 at 12:05
  • Oh I see! In other words, the missing characters in the 3rd argument of `translate` send the digits to `null`... Thanks! – Nonancourt Oct 28 '16 at 13:46
  • Yes. You would think that `translate('123', '0123456789', null)` would do, but since 3rd argument can not be null, the trick is to add an 'x' that gets translated to an 'x' (i.e. in effect no translation). – Martin Schapendonk Oct 28 '16 at 14:00