Questions tagged [utl-match]

The UTL_MATCH package facilitates matching two records. This is typically used to match names, such as two First Names or two Last Names.

From documentation,

"Edit Distance" also known as "Levenshtein Distance "(named after the Russian scientist Vladimir Levenshtein, who devised the algorithm in 1965), is a measure of Similarity between two strings, s1 and s2. The distance is the number of insertions, deletions or substitutions required to transform s1 to s2.�

The Edit Distance between strings "shackleford" and "shackelford" = 2

The "Jaro-Winkler algorithm" is another way of calculating Edit distance between two strings. This method, developed at the U.S. Census, is a String Comparator measure that gives values of partial agreement between two strings. The string comparator accounts for length of strings and partially accounts for typical human errors made in alphanumeric strings.

For example,

Comparison between normalized values returned by Jaro-Winkler and Edit Distance algorithms

For example,

String 1        String 2    Jaro Winkler    Edit Distance
--------        --------    ------------    -------------
Dunningham      Cunnigham   89              80
Abroms          Abrams      92              83
Lampley         Campley     90              86

Summary of UTL_MATCH Subprograms

EDIT_DISTANCE Function

Calculates the number of changes required to transform string-1 into string-2

EDIT_DISTANCE_SIMILARITY Function

Calculates the number of changes required to transform string-1 into string-2, returning a value between 0 (no match) and 100 (perfect match)

JARO_WINKLER Function

Calculates the measure of agreement between string-1 and string-2

JARO_WINKLER_SIMILARITY Function

Calculates the measure of agreement between string-1 and string-2, returning a value between 0 (no match) and 100 (perfect match)

4 questions
5
votes
2 answers

How to get almost matching string from Oracle table?

I have a table in Oracle with four columns. Now user can enter input string as "operation Knee right" (which is valid) to my query and my query should return the ICD Code (IKR123) which matches most of the word in DiagnosisName column. Following is…
shary.sharath
  • 649
  • 2
  • 14
  • 29
0
votes
1 answer

How can I show the differences beetween 2 columns from different tables using UTL_MATCH in Oracle?

I have two columns In Oracle DB which I want to compare, and if the strings are different, I want to show the differences in other columns it. I know I miss something. So for example: SELECT A.CD_KEY01, A.TEXT_01, B.TEXT_02, …
0
votes
1 answer

utl_match comparing many records

I have 2 tables - one with 1 million records, and the other with 40000 records. I need to compare for each record in a table if there's a similar string on the other table. the thing is that this procedure is very slow I need optimize this procedure…
0
votes
2 answers

Fast data comparing between two tables

I'm using Oracle11g and i would compare two tables finding records that match between them. Example: Table 1 Table 2 George Micheal Michael Paul The record "Micheal" and "Michael" match between them, so they are good…
user2520969
  • 1,389
  • 6
  • 20
  • 30