0

I have a below query to execute in Oracle, which is taking more time due to high volume of data. CUSTOMER_ADDRESS table has close to 450K records and CONTROL_REC has close to 1.7M records. It may be because of GROUP BY and a subquery. Even after using parallel threads, it is not working.

Can someone, please help to simplify the query to run faster?

SELECT /*+ parallel(A,8) */ 
A.SETID, A.CUST_ID, A.ADDRESS_SEQ_NUM,
A.ALT_NAME1, A.ALT_NAME2,  
A.LANGUAGE_CD, A.COUNTRY, A.ADDRESS1,
A.ADDRESS2, A.ADDRESS3, A.ADDRESS4, 
A.CITY, A.NUM1, A.NUM2, A.ADDR_FIELD1,
A.ADDR_FIELD2, A.ADDR_FIELD3, 
A.COUNTY, A.STATE, A.POSTAL, 
A.IN_CITY_LIMIT, A.COUNTRY_CODE, 
A.PHONE, A.EXTENSION, A.FAX, 
B.SETCNTRLVALUE, MAX(A.EFFDT) AS EFFDT 
FROM CUSTOMER_ADDRESS A, 
     CONTROL_REC B 
WHERE B.RECNAME = 'CUST_ADDRESS' 
AND A.EFFDT = (
    SELECT MAX(A_ED.EFFDT) 
    FROM CUSTOMER_ADDRESS A_ED 
    WHERE A.SETID = A_ED.SETID 
    AND A.CUST_ID = A_ED.CUST_ID 
    AND A.ADDRESS_SEQ_NUM = A_ED.ADDRESS_SEQ_NUM 
    AND A_ED.EFFDT <= SYSDATE) 
AND A.SETID = B.SETID 
GROUP BY A.SETID, A.CUST_ID, 
    A.ADDRESS_SEQ_NUM, A.ALT_NAME1, 
    A.ALT_NAME2, A.LANGUAGE_CD, 
    A.COUNTRY, A.ADDRESS1, A.ADDRESS2,
    A.ADDRESS3, A.ADDRESS4, A.CITY, 
    A.NUM1, A.NUM2, A.ADDR_FIELD1, 
    A.ADDR_FIELD2, A.ADDR_FIELD3, 
    A.COUNTY, A.STATE, A.POSTAL, 
    A.IN_CITY_LIMIT, A.COUNTRY_CODE, 
    A.PHONE, A.EXTENSION, A.FAX, B.SETCNTRLVALUE;
Jon Heller
  • 34,999
  • 6
  • 74
  • 132
Pranav
  • 363
  • 8
  • 19

2 Answers2

0

Try converting the self-join into an analytic function. The analytic function avoids reading from the same table twice, simplifies the code, and simplifies the execution plan. You may also want to convert the object-level parallelism, /*+ parallel(A,8) */ to statement-level parallelism, /*+ parallel(8) */. If one part of the query runs in parallel, it's usually best to have all parts of the query run in parallel.

SELECT /*+ parallel(8) */ 
A.SETID, A.CUST_ID, A.ADDRESS_SEQ_NUM,
A.ALT_NAME1, A.ALT_NAME2,  
A.LANGUAGE_CD, A.COUNTRY, A.ADDRESS1,
A.ADDRESS2, A.ADDRESS3, A.ADDRESS4, 
A.CITY, A.NUM1, A.NUM2, A.ADDR_FIELD1,
A.ADDR_FIELD2, A.ADDR_FIELD3, 
A.COUNTY, A.STATE, A.POSTAL, 
A.IN_CITY_LIMIT, A.COUNTRY_CODE, 
A.PHONE, A.EXTENSION, A.FAX, 
B.SETCNTRLVALUE, MAX(A.EFFDT) AS EFFDT 
FROM
    (
        SELECT CUSTOMER_ADDRESS.*,
            MAX(CASE WHEN EFFDT <= SYSDATE THEN EFFDT ELSE NULL END)
                OVER (PARTITION BY SETID, CUST_ID, ADDRESS_SEQ_NUM) MAX_EFFDT
        FROM CUSTOMER_ADDRESS
    ) A
JOIN CONTROL_REC B 
    ON A.SETID = B.SETID 
WHERE B.RECNAME = 'CUST_ADDRESS' 
    AND A.EFFDT = A.MAX_EFFDT
GROUP BY A.SETID, A.CUST_ID, 
    A.ADDRESS_SEQ_NUM, A.ALT_NAME1, 
    A.ALT_NAME2, A.LANGUAGE_CD, 
    A.COUNTRY, A.ADDRESS1, A.ADDRESS2,
    A.ADDRESS3, A.ADDRESS4, A.CITY, 
    A.NUM1, A.NUM2, A.ADDR_FIELD1, 
    A.ADDR_FIELD2, A.ADDR_FIELD3, 
    A.COUNTY, A.STATE, A.POSTAL, 
    A.IN_CITY_LIMIT, A.COUNTRY_CODE, 
    A.PHONE, A.EXTENSION, A.FAX, B.SETCNTRLVALUE;

If the above doesn't work, then you'll want to generate an an execution plan with the the GATHER_PLAN_STATISTICS hint. With the execution plan, someone can probably tell you which parts of the query run slow, and why.

Jon Heller
  • 34,999
  • 6
  • 74
  • 132
0

you would need to give the index(es) you have on those tables. From the given deterministic column predicates, B seems to be your driving table. so the /*+ parallel(A,8) */ *could * induce an FTS on the inner table rendering any indexed leading columns in the inner subquery useless

indhar
  • 1
  • 1