
I have a list of names, and I want to check whether a given name is present among the names in the table. I also have to allow for misspellings, so I want to find names that are misspelled by up to PARAMETER characters, i.e. up to PARAMETER characters of the name are different, missing, or extra compared with a name in the list. Even better would be two separate parameters, one for the difference in length and one for the number of differing characters. Example: my list contains John Doe, Adam Smith, Maria Thompson. I want to find Merie Bhompson if PARAMETER is 3 or more, but not if PARAMETER is 2 or less. I also want to find Mariaqqq Thompson as well as Ma Thompson if PARAMETER is 3 or more. It should also work if my list contains O Le and PARAMETER is 5 (the name is 4 characters long including the space). Can you please help me find a solution?

Szasza
  • First step is to correctly tag your question. Is this about a database? SQL? Then use the SOUNDEX function built into the database. – Axel Nov 30 '21 at 19:27
  • I had to do this once. There are well known algorithms for "edit distance" (https://en.wikipedia.org/wiki/Edit_distance), specifically Levenshtein distance. I wrote a function in PL/SQL that I could use with `min()` in a `select` statement. – John Bayko Dec 01 '21 at 00:11
  • Dear Axel, many thanks. Unfortunately, Soundex will not handle removing or adding 3 characters. Dear John Bayko, many thanks. I wanted to replace Levenshtein with something more useful, as this method uses a lot of computing power and is not really reliable (at least the vendor's implementation is not reliable). – Szasza Dec 01 '21 at 16:50
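
For reference, the edit-distance idea from the comments can be sketched in a few lines of Python. This is only an illustration, not the vendor's implementation or a definitive answer: the function names `levenshtein` and `find_matches` and the parameters `max_edits` and `max_length_diff` are made up here, and the comparison is assumed to be case-sensitive. The length check runs first because the edit distance can never be smaller than the difference in length, so names that are too long or too short are skipped without computing any distance, which may help with the computing cost.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def find_matches(query, names, max_edits, max_length_diff=None):
    """Return the names within max_edits edits of query.

    max_length_diff is the optional second parameter for the allowed
    difference in length; it defaults to max_edits (hypothetical design choice).
    """
    if max_length_diff is None:
        max_length_diff = max_edits
    matches = []
    for name in names:
        # Cheap pre-filter: skip names whose length differs too much.
        if abs(len(name) - len(query)) > max_length_diff:
            continue
        if levenshtein(query, name) <= max_edits:
            matches.append(name)
    return matches


names = ["John Doe", "Adam Smith", "Maria Thompson"]
print(find_matches("Merie Bhompson", names, 3))     # ['Maria Thompson']
print(find_matches("Merie Bhompson", names, 2))     # []
print(find_matches("Mariaqqq Thompson", names, 3))  # ['Maria Thompson']
print(find_matches("Ma Thompson", names, 3))        # ['Maria Thompson']
```

Note that when the threshold is larger than the name itself (O Le with PARAMETER 5), almost any short string will match, so a separate, tighter length parameter is probably worth having in that case.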

0 Answers