The UTL_MATCH package facilitates matching two records. This is typically used to match names, such as two First Names or two Last Names.
From the documentation:
"Edit Distance", also known as "Levenshtein Distance" (named after the Russian scientist Vladimir Levenshtein, who devised the algorithm in 1965), is a measure of similarity between two strings, s1 and s2. The distance is the number of insertions, deletions or substitutions required to transform s1 to s2.
The Edit Distance between the strings "shackleford" and "shackelford" is 2: the swapped "le" and "el" count as two single-character substitutions.
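To make the definition concrete, here is a minimal Python sketch of the textbook Levenshtein dynamic program. It is an illustration of the algorithm only, not Oracle's implementation, and the function name edit_distance is chosen here for readability.

```python
def edit_distance(s1: str, s2: str) -> int:
    """Levenshtein distance using the classic two-row dynamic program."""
    # previous[j] = distance between the processed prefix of s1 and s2[:j]
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]  # turning s1[:i] into "" takes i deletions
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            current.append(min(
                previous[j] + 1,         # delete c1
                current[j - 1] + 1,      # insert c2
                previous[j - 1] + cost,  # substitute c1 with c2 (or keep it)
            ))
        previous = current
    return previous[-1]


print(edit_distance("shackleford", "shackelford"))  # 2
```

Each cell only looks at the row above and the cell to its left, so two rows are enough and memory stays proportional to the length of s2.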
The "Jaro-Winkler algorithm" is another way of measuring how closely two strings match. This method, developed at the U.S. Census, is a string comparator measure that gives values of partial agreement between two strings. The string comparator accounts for the length of the strings and partially accounts for typical human errors made in alphanumeric strings.
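The following is a rough Python sketch of the commonly published Jaro-Winkler formula, returning a value between 0 and 1. The prefix scale of 0.1 and the maximum prefix length of 4 are the usual textbook defaults and are assumptions here; whether UTL_MATCH uses exactly these settings is not confirmed by the text above.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and at most `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as half a transposition each.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1,
                 max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix (assumed textbook defaults)."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


print(round(jaro_winkler("Abroms", "Abrams"), 4))  # 0.9222
```

The prefix boost is what rewards names that start the same way, which is where typing errors are least likely.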
For example, here is a comparison between the normalized values returned by the Jaro-Winkler and Edit Distance algorithms:
String 1    String 2   Jaro Winkler  Edit Distance
----------  ---------  ------------  -------------
Dunningham  Cunnigham  89            80
Abroms      Abrams     92            83
Lampley     Campley    90            86
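The Edit Distance column above can be reproduced with a small Python sketch that scales the raw distance by the length of the longer string. The formula and the rounding are assumptions, chosen because they match the three rows of the table; the documentation quoted here does not spell out the exact normalization. The function names are illustrative only.

```python
def edit_distance(s1: str, s2: str) -> int:
    """Levenshtein distance (two-row dynamic program)."""
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def edit_distance_similarity(s1: str, s2: str) -> int:
    """Normalize the distance to 0 (no match) .. 100 (perfect match).

    Assumption: divide by the longer length and round to the nearest integer.
    """
    longest = max(len(s1), len(s2))
    if longest == 0:
        return 100  # two empty strings are treated as identical
    return round(100 * (longest - edit_distance(s1, s2)) / longest)


pairs = [("Dunningham", "Cunnigham"), ("Abroms", "Abrams"), ("Lampley", "Campley")]
for a, b in pairs:
    print(a, b, edit_distance_similarity(a, b))  # 80, 83 and 86 respectively
```

For instance, "Dunningham" and "Cunnigham" differ by one substitution and one deletion, so the distance is 2 over a longest length of 10, which gives 80.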
Summary of UTL_MATCH Subprograms
EDIT_DISTANCE Function: Calculates the number of changes required to transform string-1 into string-2.
EDIT_DISTANCE_SIMILARITY Function: Calculates the number of changes required to transform string-1 into string-2, returning a value between 0 (no match) and 100 (perfect match).
JARO_WINKLER Function: Calculates the measure of agreement between string-1 and string-2.
JARO_WINKLER_SIMILARITY Function: Calculates the measure of agreement between string-1 and string-2, returning a value between 0 (no match) and 100 (perfect match).