0

I have 2 tables - one with 1 million records, and the other with 40000 records.

I need to compare for each record in a table if there's a similar string on the other table.

the thing is that this procedure is very slow

I need optimize this procedure

for tablea in ( select first_name||' '||last_name as fullname from employee ) loop

SELECT COUNT(*) INTO num_coincidencias FROM table b WHERE utl_match.jaro_winkler_similarity(b.name ,tablea .fullname) > 98

dbms_output.put_line(num_coincidencias); end loop;

1 Answers1

0

You do realize you are doing 40 billion comparisons? This is going to take a long time no matter what method you use.

Turning this into a SQL statement will eliminate context switches, I don't know if your computer has the resources to do it all in a single SQL statement:

  SELECT COUNT (*) c, a.first_name || ' ' || a.last_name full_name
    FROM employee a CROSS JOIN tableb b
   WHERE UTL_MATCH.jaro_winkler_similarity (b.first_name, a.first_name || a.last_name) > 98
GROUP BY a.first_name || ' ' || a.last_name
Brian Leach
  • 2,025
  • 1
  • 11
  • 14