0

I have an Oracle table which contains user comment. I would like to mask all words like Name/Surname/Telephone and Address

Is there a simple way to do this in Oracle?

Example :

"Hello, i'm Alex DURAND, my phone number is 0685987525"
should be transform in
"Hello, i'm XXXX XXXXXX, my phone number is XXXXXXXXXX"

Thanks a lot

Jim Simson
  • 2,774
  • 3
  • 22
  • 30
Flatbeat
  • 123
  • 1
  • 3
  • 13
  • 2
    Is comment's format **always** the same? If not, how will you know that "Alex Durand" (which should be masked) is different from "His name" (which should not be masked)? Also, how do you know that "09012020" is telephone number and not today's date? Shortly - we need *rules*. – Littlefoot Jan 09 '20 at 09:03

2 Answers2

0

When the task is to remove just the words the regexp_replace would be helpful here

select regexp_replace(lower(user_comment), '(surname|name|telephone|address)', '') replace_regexp
  from tab1
ekochergin
  • 4,109
  • 2
  • 12
  • 19
0
  1. Phone number is masked by using regexp_replace() function through '([[:digit:]])' pattern

  2. The Part starting with i'm is extracted by using regexp_substr() function through '(i''m )(.*?)[^,]+' pattern

  3. The name and the surname are extracted from the string derived in the second step through use of regexp_substr() function through '[^ ]+'

  4. At the last step successive use of non-whitespace character matching through regexp_replace() and replace() functions together

    with t2 as
    (
     select regexp_replace( str, '([[:digit:]])', 'X') as str,
            regexp_substr(str,'(I''m )(.*?)[^,]+',1,1,'i') as str2       
       from t --> your original table
    ), t3 as
    (
     select regexp_substr(str2,'[^ ]+',1,2) as name,
            regexp_substr(str2,'[^ ]+',1,3) as surname,
            t2.*
       from t2
     )
    select replace(
                   replace( str, name, regexp_replace(name,'(*\S)','X') )
                  ,surname, regexp_replace(surname,'(*\S)','X') )
           as "Result String"        
      from t3
    

where the format of each values in the str column is assumed to be fixed throughout the table.

Demo

Barbaros Özhan
  • 59,113
  • 10
  • 31
  • 55