0

I have a column that should contain phone numbers but it contains whatever the user wanted. I need to create an update to remove all the characters after an invalid character.

To do this I am using a regex as PATINDEX('%[^0-9+-/()" "]%', [MobilNr]) and it seemed to work until I had some numbers as +1235, 36446 and to my surprise the result is 0 instead of 6. Also if the number contains . it returns 0.

Does PATINDEX ignores dot(".") and comma(",")? Are there other characters that PATINDEX will ignore?

DineshDB
  • 5,998
  • 7
  • 33
  • 49
Dan
  • 683
  • 2
  • 8
  • 24
  • Not really sure what you're asking here. If you could [edit] your question to include sample data (As DDL+DML!), your current statement and expected results that would be great. – Zohar Peled Nov 07 '19 at 08:12
  • Why does it matter the DDL and DML? Try SELECT PATINDEX('%[^0-9+-/()" "]%', '+1235, 36446') and the result is 0 instead of 6 which is the index of ",". From this I undestand that PATINDEX or Regex ignores dot and comma – Dan Nov 07 '19 at 08:55

1 Answers1

2

It's not that PATINDEX ignores the comma and the dot, it's your pattern that created this problem.

With PATINDEX, the hyphen char (-) has a special meaning - it's in fact an operator that denotes an inclusive range - like 0-9 denotes all digits between 0 and 9 - so when you do +-/ it means all the chars between + and / (inclusive, of course). The comma and dot chars are within this range, that's why you get this result.

Fixing the pattern is easy: either use | as a logical or, or simply move the hyphen to the end of the pattern:

SELECT PATINDEX('%[^0-9/()" "+-]%', '+1235, 36446') -- Result: 6
Zohar Peled
  • 79,642
  • 10
  • 69
  • 121