I have written the Longest Common Subsequence algorithm in PL/SQL function. I need only the length of the subsequence as output. Now my greatest challenge is the speed of the function when using the function for hundreds of thousands of records. There is a nested loop which fills the multi-dimensional table type object.
I have two tables one containing 1 million records (string) and other having almost 28,000 records(string). The requirement is to map each records in table 1 with each records of table 2 and get the LCS length for each.
According to my analysis the nested for loop is taking maximum amount of time.
/*
LCS LOGIC
*/
1 CREATE OR REPLACE
2 FUNCTION GET_LCS_LENGTH(
3 table1_string IN NVARCHAR2,
4 table2_string IN NVARCHAR2)
5 RETURN NUMBER
6 AS
7 TYPE t_number_array IS TABLE OF NUMBER INDEX BY BINARY_INTEGER;
8 TYPE t_bidimensional_number_array IS TABLE OF t_number_array INDEX BY BINARY_INTEGER;
9 matrix t_bidimensional_number_array ;
10 --...
11 BEGIN
12 len_str1 := LENGTH(table1_string);
13 len_str2 := LENGTH(table2_string);
14 matrix(1)(1) := 0;
15 FOR i IN 2..(len_str2+1)
16 LOOP
17 matrix(i)(1) := 0;
18 ch1 := SUBSTR(table2_string,i-1,1);
19 FOR j IN 2..(len_str1+1)
20 LOOP
21 matrix(1)(j) := 0;
22 ch2 := SUBSTR(table1_string,j-1,1);
23 IF ch1 = ch2 THEN
24 matrix(i)(j) := matrix(i - 1)(j - 1) + 1;
25 ELSE
26 matrix(i)(j) := greatest(matrix(i)(j - 1),matrix(i - 1)(j));
27 END IF;
28 END LOOP;
29 END LOOP;
30 lcs_Dist := matrix(len_str2+1)(len_str1+1);
31 matrix.DELETE;
32 END;
/*
LCS LOGIC END
*/
How can I replace this nested for loop to get the LCS length or use some different logic or how to optimize this code further?
Currently when one record in table 1 maps with each 28,000 records in table 2 , the time taken is 5 seconds which is costly for me.
It should run in fractions of seconds for 28,000 records then only I can achieve a certain goal with the other 1 million records.