2

Have legacy complex SQL script in form of

WITH
A AS (...<SUB_QA>...),
B AS (...<SUB_QB>...),
C AS (...<SUB_QC>...),
...

SELECT ... FROM
A 
LEFT JOIN B 
LEFT JOIN C
LEFT JOIN ...
ON ....

Trying to extract the SUB_QA, SUB_QB ... into functions, since each of them will be used by multiple scripts; and each of the calling script will pass one or more variable with different values to them (e.g. the 'last_name' in sample code below). So far, I got the following:

Create Sample Data:

    --------PERSON table------------
    DROP TABLE Test_Persons;
    CREATE TABLE Test_Persons (
        PersonID int,
        LastName varchar2(255),
        FirstName varchar2(255)
    );

    INSERT INTO Test_Persons
        (PersonID,LastName,FirstName)
        values(1,'LN_1','FN_1');

    INSERT INTO Test_Persons
        (PersonID,LastName,FirstName)
        values(2,'LN_2','FN_2');

    INSERT INTO Test_Persons
        (PersonID,LastName,FirstName)
        values(3,'LN_21','FN_2');

    --------Salary table------------
    DROP TABLE TEST_SALARY_A;
    CREATE TABLE TEST_SALARY_A ( -- no 'OR REPLACE' for ORACLE
        SalaryID int,
        PersonID int,
        Amount int,
        Tax int,
        Bank varchar2(20)
    );

    INSERT INTO TEST_SALARY_A
        (SalaryID, PersonID, Amount, Tax, Bank)
        VALUES
        (1, 1, 1000, 300, 'BOA1');

    INSERT INTO TEST_SALARY_A
        (SalaryID, PersonID, Amount, Tax, Bank)
        VALUES
        (2, 2, 2000, 600, 'JP1');

    INSERT INTO TEST_SALARY_A
        (SalaryID, PersonID, Amount, Tax, Bank)
        VALUES
        (3, 3, 3000, 900, 'TD1');

    --------Address table------------
    DROP TABLE TEST_ADDRESS_A;
    CREATE TABLE TEST_ADDRESS_A ( 
        AddressID int,
        PersonID int,
        Address varchar2(255)
    );

    INSERT INTO TEST_ADDRESS_A
        (AddressID, PersonID, Address)
        VALUES
        (1, 1, 'address1');

    INSERT INTO TEST_ADDRESS_A
        (AddressID, PersonID, Address)
        VALUES
        (2, 2, 'address2');

    INSERT INTO TEST_ADDRESS_A
        (AddressID, PersonID, Address)
        VALUES
        (3, 3, 'address3');

    commit;

The Original WITH ... SELECT .... form:

--------------------------1. Multiple Subqueries in a large WITH ... SELECT ... block ------------------------------
WITH 
TEMP1 AS (
    SELECT
        p.PERSONID,
        p.LASTNAME,
        a.ADDRESS
    FROM TEST_PERSONS p
    LEFT JOIN TEST_ADDRESS_A a ON p.PERSONID = a.PERSONID
    WHERE p.LASTNAME = 'LN_1'
),
TEMP2 AS (
    SELECT
        p.PERSONID,
        s.TAX
    FROM TEST_PERSONS p
    LEFT JOIN TEST_SALARY_A s ON p.PERSONID = s.PERSONID
    WHERE p.LASTNAME = 'LN_1'
),
-- TEMP3 is from TEMP1 left join TEMP2
TEMP3 AS (
    SELECT
        t1.PERSONID,
        t1.LASTNAME,
        t1.ADDRESS,
        t2.TAX
    FROM TEMP1 t1
    LEFT JOIN TEMP2 t2 ON t1.PERSONID = t2.PERSONID
    WHERE t1.LASTNAME = 'LN_1'
)

---------------------RESULT QUERY 1: SELECT via WITH-----------------------
SELECT t3.PERSONID, ADDRESS, t3.TAX 
FROM TEMP2 t2 
LEFT JOIN TEMP3 t3 ON t3.PERSONID = t2.PERSONID
WHERE t3.LASTNAME = 'LN_1';

Rewrite to function 'chain'

--------------------------2. Rewritten in Function 'chain' ------------------------------
CREATE OR REPLACE PACKAGE MY_JOIN_TEST_SP_PACKAGE_4 AS  -- PACKAGE Decalaration
    TYPE type1 IS RECORD(
      PersonID int,
      LastName varchar2(255),
      Address varchar2(255)
    );
    TYPE tab_type1 IS TABLE OF type1;

    TYPE type2 IS RECORD(
      PersonID int,
      Tax int
    );
    TYPE tab_type2 IS TABLE OF type2;

    TYPE type3 IS RECORD(
      PersonID int,
      LastName varchar2(255),
      Address varchar2(255),
      Tax int
    );
    TYPE tab_type3 IS TABLE OF type3;

    FUNCTION func1(last_name VARCHAR2)
        RETURN tab_type1
        PIPELINED;

    FUNCTION func2(last_name VARCHAR2)
        RETURN tab_type2
        PIPELINED;

    FUNCTION func3(last_name VARCHAR2)
        RETURN tab_type3
        PIPELINED;
END;
/

CREATE OR REPLACE PACKAGE BODY MY_JOIN_TEST_SP_PACKAGE_4 AS

    --- func1 ---
    FUNCTION func1(last_name VARCHAR2)
        RETURN tab_type1
        PIPELINED 
        AS
        rec1 type1;
        BEGIN
            FOR x IN (
                SELECT
                    p.PERSONID,
                    p.LASTNAME,
                    a.ADDRESS
                FROM TEST_PERSONS p
                LEFT JOIN TEST_ADDRESS_A a ON p.PERSONID = a.PERSONID
                WHERE p.LASTNAME = last_name   
             )
          LOOP
            SELECT x.PERSONID, x.LASTNAME, x.ADDRESS
                INTO rec1
                FROM DUAL;
            PIPE ROW (rec1);
          END LOOP;
        END;

    --- func2 ---
    FUNCTION func2(last_name VARCHAR2)
        RETURN tab_type2
        PIPELINED 
        AS
        rec2 type2;
        BEGIN
            FOR x IN (
                SELECT
                    p.PERSONID,
                    s.TAX
                FROM TEST_PERSONS p
                LEFT JOIN TEST_SALARY_A s ON p.PERSONID = s.PERSONID
                WHERE p.LASTNAME = last_name 
             )
          LOOP
            SELECT x.PERSONID, x.TAX
                INTO rec2
                FROM DUAL;
            PIPE ROW (rec2);
          END LOOP;
        END;

    --- func3 ---
    FUNCTION func3(last_name VARCHAR2)
        RETURN tab_type3
        PIPELINED 
        AS
        rec3 type3;
        BEGIN
            FOR x IN (
                SELECT
                    t1.PERSONID,
                    t1.LASTNAME,
                    t1.ADDRESS,
                    t2.TAX
                FROM table(func1(last_name)) t1  -- !! call func1(...) !! -- 
                LEFT JOIN table(func2(last_name)) t2 ON t1.PERSONID = t2.PERSONID -- !! call func2(...) !! --
                WHERE t1.LASTNAME = last_name
             )
          LOOP
            SELECT x.PERSONID, x.LASTNAME, x.ADDRESS, x.TAX
                INTO rec3
                FROM DUAL;
            PIPE ROW (rec3);
          END LOOP;
        END;

END; -- END of CREATE
/

--------------------RESULT QUERY 2: SELECT via FUNCTIONS-----------------------
SELECT t3.PERSONID, ADDRESS, t3.TAX 
FROM TABLE(MY_JOIN_TEST_SP_PACKAGE_4.func2('LN_1')) t2 
LEFT JOIN TABLE(MY_JOIN_TEST_SP_PACKAGE_4.func3('LN_1')) t3 ON t3.PERSONID = t2.PERSONID -- func3(...) will call func2(...) again with the same param, was func2 result cached? 
WHERE t3.LASTNAME = 'LN_1';

QUESTION:

  1. Is there any cleaner/more short-handed way to achieve this?
  2. Is 'RESULT QUERY 2' (chained function calls, is calling function result cached?) significantly less efficient than 'RESULT QUERY 1'? (I failed to manage to get any meaningful Explain Plan info in SqlDeveloper)
  3. In this kind of 'Chained' function calls, is PIPE ROW (...) going to be an advantage or complication in term of performance if it is at multiple levels of the call stack?
Stochastika
  • 305
  • 1
  • 2
  • 14
  • 3
    Why do you want to use functions? They tend to kill performance. Views s hould work. – Gordon Linoff Jul 23 '18 at 21:51
  • 2
    What was wrong with you CTEs? Were you getting a specific error or seeing a performance issue? Gordon is absolutely right here. You're going to kill your performance trying to change those to functions. The optimizer is very capable of reading and running those CTEs. – Shawn Jul 23 '18 at 22:09
  • 3
    I would look at your overall query and determine what could be combined in your result data. It looks like you may be repeating essentially the same thing to get different pieces of data. You may be able to combine all of those queries. Regardless, the CTEs will still likely perform better than converting to functions. So 1) I think so, through combining queries, 2) No. Optimizer knows what it needs to do. 3) CTEs are generally an advantage. Overall query may be improvable. Test. – Shawn Jul 23 '18 at 22:14
  • Thanks guys. But 1. the logic encapsulated in each of the SUB_QA, SUB_QB, etc will be 'called' upon by multiple scripts (hence presumably CTE will not be so suitable). 2. each of the SUB_QA, SUB_QB, etc will have one or more variables passed to them from different scripts (hence presumably View will not be so suitable). At least that's what I assume. – Stochastika Jul 24 '18 at 01:55
  • If my above assumption are correct, then any suggestion of some kind of compromised solution to fit these constraints as well as performances? – Stochastika Jul 24 '18 at 01:58
  • You're looking to make things more "efficient". Please define "efficiency". Thanks. – Bob Jarvis - Слава Україні Jul 24 '18 at 02:46
  • @Bob Jarvis, one of my question was 'Is 'RESULT QUERY 2' (chained function calls, is calling function result cached?) significantly less efficient than 'RESULT QUERY 1'?'. If so, then I would like to know if there is a more efficient way, i.e. quicker in term of execution time, repeatedly, to achieve what the 'chained function' way can achieve (considering all the other aspects we have been discussing in the comments above) – Stochastika Jul 24 '18 at 04:07

0 Answers0