I have two sets of data coming from external source - purchase date of the customer and last email click/open date of the customer. This is stored in two tables PURCHASE_INTER AND ACTIVITY_INTER tables respectively. Purchase data is in multiple and I need to pick up last purchase date. But activity data is unique for each customer. The data is independent of each other and other set of data may not be present. We have written below query which combines two tables, groups them based on person_id which is id of the customer coming from external source and get the latest of the dates, join with our customer table to get the customer email, and again join with another table where this data will be ultimately stored for sake of knowing whether it is insert or update operation. Could you please suggest how I can improve the performance of this query. It is terribly slow and taking more than 10 hours. There are millions of records coming in PURCHASE_INTER AND ACTIVITY_INTER tables.
SELECT INTER.*, C.ID AS CUSTOMER_ID, C.EMAIL AS CUSTOMER_EMAIL, LSI.ID AS INTERACTION_ID, ROW_NUMBER() OVER (ORDER BY PERSON_ID ASC) AS RN FROM (
SELECT PERSON_ID AS PERSON_ID,
MAX(LAST_CLICK_DATE) AS LAST_CLICK_DATE,
MAX(LAST_OPEN_DATE) AS LAST_OPEN_DATE,
MAX(LAST_PURCHASE_DATE) AS LAST_PURCHASE_DATE
FROM (
SELECT ACT.PERSON_ID AS PERSON_ID,
ACT.LAST_CLICK_DATE AS LAST_CLICK_DATE,
ACT.LAST_OPEN_DATE AS LAST_OPEN_DATE,
NULL AS LAST_PURCHASE_DATE
FROM ACTIVITY_INTER ACT
WHERE ACT.JOB_ID = 77318317
UNION
SELECT PUR.PERSON_ID AS PERSON_ID,
NULL AS LAST_CLICK_DATE,
NULL AS LAST_OPEN_DATE,
PUR.LAST_PURCHASE_DATE AS LAST_PURCHASE_DATE
FROM PURCHASE_INTER PUR
WHERE PUR.JOB_ID = 77318317
) GROUP BY PERSON_ID
) INTER LEFT JOIN CUSTOMER C ON INTER.PERSON_ID = C.PERSON_ID
LEFT JOIN INTERACTION LSI ON C.ID = LSI.CUSTOMER_ID;