0

I need to redact proper names from text fields in SQL Server. Let's say I have the following table:

PersonTable
    FirstName
    LastName
    Notes

I could do this:

UPDATE PersonTable
SET Notes = REPLACE(REPLACE(Notes, FirstName, 'REDACTED'), LastName, 'REDACTED')

That should work fine for the exact match condition, but what if someone has misspelled first or last name in the Notes field, or worse yet, used a nick-name like Jim?

I think Full Text searching using Contains is good for this sort of thing where the deviation is meaning or derivation-based, but will it work for names? Even if it worked for finding rows where Notes contained a name, I don't think it works with the Replace scenario.

I have also considered SOUNDEX, but I am also not seeing how to do this using Replace for a text field. The only way I can see using Soundex or something like that would be to split the text field into words and do a comparison on each word. I have to do this on many text fields in very heavily populated tables, so I'm not excited about doing that if there's a better way.

Does anyone have experience doing something like this?

Thanks

sydneyos
  • 4,527
  • 6
  • 36
  • 53
  • Replace on space before and space after to get whole words or will get things like wordRedacted for worksmith. What good is contains inflection and synonymy on proper names? What about proper names from other rows? For soundex you need the individual word from Notes. Question # 666. This needs more than TSQL and more than one approach. What is the sensitivity to a name getting out? – paparazzo Jan 01 '13 at 14:14
  • @Blam, this is a good observation but requires some thought. For instance, it works okay for words that are indeed surrounded by spaces. But for words at the beginning of a string, or followed by punctuation, take care. – sydneyos Jan 09 '13 at 02:18

0 Answers0