---------------
MASTER TABLE
---------------
DATA_KEY NUMBER
TEXT VARCHAR2(2000)
ORDER_NO NUMBER
---------------
DETAIL TABLE
---------------
DATA_KEY NUMBER
SIMILAR_DATA_KEY NUMBER
DISTANCE_COUNT NUMBER
---------------
INSERT QUERY
---------------
INSERT INTO DETAIL
(
SELECT DATA_KEY, SIMILAR_DATA_KEY, DISTANCE_COUNT
FROM
(
SELECT A.DATA_KEY AS DATA_KEY, B.DATA_KEY AS SIMILAR_DATA_KEY,
UTL_MATCH.EDIT_DISTANCE(A.TEXT, B.TEXT) AS DISTANCE_COUNT
FROM
(SELECT DATA_KEY, TEXT, ORDER_NO FROM MASTER) A
INNER JOIN
(SELECT DATA_KEY, TEXT, ORDER_NO FROM MASTER) B
ON (A.ORDER_NO < B.ORDER_NO)
)
WHERE DISTANCE_COUNT <= 5
)
I need compare MASTER table TEXT field with other TEXT field.
indexes are not exist. master table 90,000 rows.
ORDER_NO field is for avoid duplicated compare. (1 .. 90000)
=============================================================
A.ORDER_NO < B.ORDER_NO
------------------------------------------
1, 1 <- exclude
1, 2 <- join
1, 3 <- join
1, 4 <- join
..
2, 1 <- exclude
2, 2 <- exclude
2, 3 <- join
2, 4 <- join
...
3, 1 <- exclude
3, 2 <- exclude
3, 3 <- exclude
3, 4 <- join
1. NOT need compare 1 and 1
2. Need compare 1 and 2
3. NOT need compare 2 and 1 (because, duplicate 2.)
so, for decrease compare count...
=============================================================
Slow zone is (WHERE DISTANCE_COUNT <= 5
) ?
Slow zone is comparing rows (90000*89999/2
) ?
Query elapse time is 7 days.
6,000 rows inserted to DETAIL table.
How to speed up?
I'm sorry for poor English...