Using a temporary table in a DB2 function created for parallel processing..?

Question

EDIT: I've added more at the end of the original question.

[Original question]:

In a DB2 SQL table function, can the scope of a declared global temporary table be limited to the current instance of the function..? This is DB2-for-i v7r3m0.

I'm planning a table function for analysis of character data. To do its thing, the function needs a temporary table in which to cache information which will then be repeatedly looped through in different ways until a result is reached. My first thought was to use a declared global temporary table, but there's that word "global".

For greater efficiency, I'd like to write the function with ALLOW PARALLEL, and possibly NOT FENCED, but these two options seem to be in conflict with the nature of a DGTT.

Also, because MODIFIES SQL DATA is needed to write to a DGTT, I'm not sure if there's any way I can use ALLOW PARALLEL, even with careful design of the function.

The documentation confuses me a bit; perhaps I'm not understanding something? In the opening paragraph it says:

Each application process that defines a declared temporary table of the same name has its own unique description of the temporary table.

And then under table-name it says:

If a declared temporary table, or an index or view that is dependent on a declared temporary table already exists with the same name, an error is returned.

So I would like to ask: can a DGTT or any other type of temporary table be created so its scope is limited only to the current instance of the function..?

I could muddle about with random names for the DGTT, but that would complexify the code a bit. Ideally I'd like to just use one name.

Another thought I had was to have a set of key columns in the DGTT that are populated by the input parameters of the function. Then all instances of the function could share one DGTT, but would only access their own rows.

Does anyone have any suggestions..?

EDIT:

Ok, I've worked on this for two days straight, and finally created a working example. There is an obvious excess of DGTTs; these were originally CTEs, but I converted them all for development to allow a result window after each step of the process.

And what started with the intention of being a scalar function with one input parameter, and one output, I think it could very satisfactorily be a table function with no input parameters.

What I ended up with turned out pretty efficient, and MUCH faster than my first couple of tries. In fact, I could very easily keep this as-is, but I would convert the DGTTs back to CTEs to see if there's any change in efficiency. Its possibly I would see no change, in which case there would be no DGTTs would thus would make my whole OP irrelevant, lol.

That being said, as anyone might see, there are a couple of opportunities to create loops that would greatly reduce the amount of repetitive code & the amount of code to process, and would also support any number of words...

-- THIS SCRIPT WAS NOT INTENDED TO BE THE FINAL PRODUCT, BUT MAY END UP CLOSE
-- DGTTs AND MULTIPLE RESULT WINDOWS WERE INTENDED TO EXAMININE EACH STEP
-- UNCOMMENT ANY OF THE SELECT STATEMENTS TO SEE WHAT'S GOING ON THERE 
-- THIS IS HARD-CODED TO PROCESS UP TO (9) WORDS
-- IF RE-WRITTEN WITH LOOPS, IT'D BE MORE COMPACT AND SUPPORT ANY NUMBER OF WORDS
DECLARE GLOBAL TEMPORARY TABLE SESSION.WORDS AS
    (   WITH    PASS1   AS (    SELECT      LPAD(TRIM(CHAR(BXUPCR)),14,'0')                     AS BXUPCR,
                                            UPPER(TRIM(REGEXP_SUBSTR(BXADDS,'[^ ]+',1,1,'i')))  AS W1,
                                            UPPER(TRIM(REGEXP_SUBSTR(BXADDS,'[^ ]+',1,2,'i')))  AS W2,
                                            UPPER(TRIM(REGEXP_SUBSTR(BXADDS,'[^ ]+',1,3,'i')))  AS W3,
                                            UPPER(TRIM(REGEXP_SUBSTR(BXADDS,'[^ ]+',1,4,'i')))  AS W4,
                                            UPPER(TRIM(REGEXP_SUBSTR(BXADDS,'[^ ]+',1,5,'i')))  AS W5,
                                            UPPER(TRIM(REGEXP_SUBSTR(BXADDS,'[^ ]+',1,6,'i')))  AS W6,
                                            UPPER(TRIM(REGEXP_SUBSTR(BXADDS,'[^ ]+',1,7,'i')))  AS W7,
                                            UPPER(TRIM(REGEXP_SUBSTR(BXADDS,'[^ ]+',1,8,'i')))  AS W8,
                                            UPPER(TRIM(REGEXP_SUBSTR(BXADDS,'[^ ]+',1,9,'i')))  AS W9
                                FROM        F_CERTOB.BEERXT AS BEERT
                                INNER JOIN  F_CERTOB.INVENT AS INVEN ON  BXITEM=IYITEM
                                WHERE       BXPACK > 999
                                AND         BXUPCR > 0
                                AND         BXUM    = 'CB'
                                AND         IYWHSE  = '1'
                                AND         IYSTAT IN ('A','B')
                            )
        --PASS #2: GROUP ITMES WITH IDENTICAL UPC & TEXT. THIS IS VERY FEW ITEMS OUT OF ALL.
        SELECT      BXUPCR, W1, W2, W3, W4, W5, W6, W7, W8, W9, COUNT(*) AS C
        FROM        PASS1
        GROUP BY    BXUPCR, W1, W2, W3, W4, W5, W6, W7, W8, W9
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.WORDS ORDER BY 1,2,3,4,5,6,7,8,9,10 @

--NUMBER OF ITEMS PER UPC
DECLARE GLOBAL TEMPORARY TABLE SESSION.WCNT AS 
    (   SELECT      BXUPCR, COUNT(BXUPCR) AS UPC_COUNT
        FROM        SESSION.WORDS
        GROUP BY    BXUPCR
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.WCNT ORDER BY 1 @

--FUNNEL CERTAIN ITEMS FOR TESTING    ------------FOR DEV ONLY
--THIS CAN BE REMOVED AND PATCHED AROUND LATER
DECLARE GLOBAL TEMPORARY TABLE SESSION.FUNNEL AS 
    (   SELECT      A.BXUPCR, A.W1, A.W2, A.W3, A.W4, A.W5, A.W6, A.W7, A.W8, A.W9
        FROM        SESSION.WORDS AS A
        INNER JOIN  SESSION.WCNT  AS B ON A.BXUPCR=B.BXUPCR
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.FUNNEL ORDER BY 1 @

--GROUPS OF (1) WORD:
DECLARE GLOBAL TEMPORARY TABLE SESSION.GRP1 AS 
    (   SELECT          BXUPCR, W1, UPC_COUNT, COUNT(*) AS UPC_GROUP_COUNT, 1 AS THIS_GROUP
        FROM            SESSION.FUNNEL
        INNER JOIN      SESSION.WCNT USING(BXUPCR)
        WHERE           (W1) IS NOT NULL
        GROUP BY        BXUPCR, W1, UPC_COUNT
    HAVING          UPC_COUNT = COUNT(*)
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.GRP1 ORDER BY 1 @

--GROUPS OF (2) WORDS:
DECLARE GLOBAL TEMPORARY TABLE SESSION.GRP2 AS 
    (   SELECT          BXUPCR, W1, W2, UPC_COUNT, COUNT(*) AS UPC_GROUP_COUNT, 2 AS THIS_GROUP
        FROM            SESSION.FUNNEL
        INNER JOIN      SESSION.WCNT USING(BXUPCR)
        WHERE           (W1 || W2) IS NOT NULL
        GROUP BY        BXUPCR, W1, W2, UPC_COUNT
        HAVING          UPC_COUNT = COUNT(*)
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.GRP2 ORDER BY 1 @

--GROUPS OF (3) WORDS:
DECLARE GLOBAL TEMPORARY TABLE SESSION.GRP3 AS 
    (   SELECT          BXUPCR, W1, W2, W3, UPC_COUNT, COUNT(*) AS UPC_GROUP_COUNT, 3 AS THIS_GROUP
        FROM            SESSION.FUNNEL
        INNER JOIN      SESSION.WCNT USING(BXUPCR)
        WHERE           (W1 || W2 || W3) IS NOT NULL
        GROUP BY        BXUPCR, W1, W2, W3, UPC_COUNT
        HAVING          UPC_COUNT = COUNT(*)
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.GRP3 ORDER BY 1 @

--GROUPS OF (4) WORDS:
DECLARE GLOBAL TEMPORARY TABLE SESSION.GRP4 AS 
    (   SELECT          BXUPCR, W1, W2, W3, W4, UPC_COUNT, COUNT(*) AS UPC_GROUP_COUNT, 4 AS THIS_GROUP
        FROM            SESSION.FUNNEL
        INNER JOIN      SESSION.WCNT USING(BXUPCR)
        WHERE           (W1 || W2 || W3 || W4) IS NOT NULL
        GROUP BY        BXUPCR, W1, W2, W3, W4, UPC_COUNT
        HAVING          UPC_COUNT = COUNT(*)
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.GRP4 ORDER BY 1 @

--GROUPS OF (5) WORDS:
DECLARE GLOBAL TEMPORARY TABLE SESSION.GRP5 AS 
    (   SELECT          BXUPCR, W1, W2, W3, W4, W5, UPC_COUNT, COUNT(*) AS UPC_GROUP_COUNT, 5 AS THIS_GROUP
        FROM            SESSION.FUNNEL
        INNER JOIN      SESSION.WCNT USING(BXUPCR)
        WHERE           (W1 || W2 || W3 || W4 || W5) IS NOT NULL
        GROUP BY        BXUPCR, W1, W2, W3, W4, W5, UPC_COUNT
        HAVING          UPC_COUNT = COUNT(*)
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.GRP5 ORDER BY 1 @

--GROUPS OF (6) WORDS:
DECLARE GLOBAL TEMPORARY TABLE SESSION.GRP6 AS 
    (   SELECT          BXUPCR, W1, W2, W3, W4, W5, W6, UPC_COUNT, COUNT(*) AS UPC_GROUP_COUNT, 6 AS THIS_GROUP
        FROM            SESSION.FUNNEL
        INNER JOIN      SESSION.WCNT USING(BXUPCR)
        WHERE           (W1 || W2 || W3 || W4 || W5 || W6) IS NOT NULL
        GROUP BY        BXUPCR, W1, W2, W3, W4, W5, W6, UPC_COUNT
        HAVING          UPC_COUNT = COUNT(*)
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.GRP6 ORDER BY 1 @

--GROUPS OF (7) WORDS:
DECLARE GLOBAL TEMPORARY TABLE SESSION.GRP7 AS 
    (   SELECT          BXUPCR, W1, W2, W3, W4, W5, W6, W7, UPC_COUNT, COUNT(*) AS UPC_GROUP_COUNT, 7 AS THIS_GROUP
        FROM            SESSION.FUNNEL
        INNER JOIN      SESSION.WCNT USING(BXUPCR)
        WHERE           (W1 || W2 || W3 || W4 || W5 || W6 || W7) IS NOT NULL
        GROUP BY        BXUPCR, W1, W2, W3, W4, W5, W6, W7, UPC_COUNT
        HAVING          UPC_COUNT = COUNT(*)
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.GRP7 ORDER BY 1 @

--GROUPS OF (8) WORDS:
DECLARE GLOBAL TEMPORARY TABLE SESSION.GRP8 AS 
    (   SELECT          BXUPCR, W1, W2, W3, W4, W5, W6, W7, W8, UPC_COUNT, COUNT(*) AS UPC_GROUP_COUNT, 8 AS THIS_GROUP
        FROM            SESSION.FUNNEL
        INNER JOIN      SESSION.WCNT USING(BXUPCR)
        WHERE           (W1 || W2 || W3 || W4 || W5 || W6 || W7 || W8) IS NOT NULL
        GROUP BY        BXUPCR, W1, W2, W3, W4, W5, W6, W7, W8, UPC_COUNT
        HAVING          UPC_COUNT = COUNT(*)
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.GRP8 ORDER BY 1 @

--GROUPS OF (9) WORDS:
DECLARE GLOBAL TEMPORARY TABLE SESSION.GRP9 AS 
    (   SELECT          BXUPCR, W1, W2, W3, W4, W5, W6, W7, W8, W9, UPC_COUNT, COUNT(*) AS UPC_GROUP_COUNT, 9 AS THIS_GROUP
        FROM            SESSION.FUNNEL
        INNER JOIN      SESSION.WCNT USING(BXUPCR)
        WHERE           (W1 || W2 || W3 || W4 || W5 || W6 || W7 || W8 || W9) IS NOT NULL
        GROUP BY        BXUPCR, W1, W2, W3, W4, W5, W6, W7, W8, W9, UPC_COUNT
        HAVING          UPC_COUNT = COUNT(*)
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.GRP9 ORDER BY 1 @

--MERGE EVERYTHING TOGETHER:
DECLARE GLOBAL TEMPORARY TABLE SESSION.UNIONX AS 
    (   SELECT      BXUPCR, W1 AS W1, '' AS W2, '' AS W3, '' AS W4, '' AS W5, '' AS W6, '' AS W7, '' AS W8, '' AS W9, 1 AS WC
        FROM        SESSION.GRP1
        UNION
        SELECT      BXUPCR, W1 AS W1, W2 AS W2, '' AS W3, '' AS W4, '' AS W5, '' AS W6, '' AS W7, '' AS W8, '' AS W9, 2 AS WC
        FROM        SESSION.GRP2
        UNION
        SELECT      BXUPCR, W1 AS W1, W2 AS W2, W3 AS W3, '' AS W4, '' AS W5, '' AS W6, '' AS W7, '' AS W8, '' AS W9, 3 AS WC
        FROM        SESSION.GRP3
        UNION
        SELECT      BXUPCR, W1 AS W1, W2 AS W2, W3 AS W3, W4 AS W4, '' AS W5, '' AS W6, '' AS W7, '' AS W8, '' AS W9, 4 AS WC
        FROM        SESSION.GRP4
        UNION
        SELECT      BXUPCR, W1 AS W1, W2 AS W2, W3 AS W3, W4 AS W4, W5 AS W5, '' AS W6, '' AS W7, '' AS W8, '' AS W9, 5 AS WC
        FROM        SESSION.GRP5
        UNION
        SELECT      BXUPCR, W1 AS W1, W2 AS W2, W3 AS W3, W4 AS W4, W5 AS W5, W6 AS W6, '' AS W7, '' AS W8, '' AS W9, 6 AS WC
        FROM        SESSION.GRP6
        UNION
        SELECT      BXUPCR, W1 AS W1, W2 AS W2, W3 AS W3, W4 AS W4, W5 AS W5, W6 AS W6, W7 AS W7, '' AS W8, '' AS W9, 7 AS WC
        FROM        SESSION.GRP7
        UNION
        SELECT      BXUPCR, W1 AS W1, W2 AS W2, W3 AS W3, W4 AS W4, W5 AS W5, W6 AS W6, W7 AS W7, W8 AS W8, '' AS W9, 8 AS WC
        FROM        SESSION.GRP8
        UNION
        SELECT      BXUPCR, W1 AS W1, W2 AS W2, W3 AS W3, W4 AS W4, W5 AS W5, W6 AS W6, W7 AS W7, W8 AS W8, W9 AS W9, 9 AS WC
        FROM        SESSION.GRP9
        GROUP BY    BXUPCR, W1, W2, W3, W4, W5, W6, W7, W8, W9
    )   WITH DATA WITH REPLACE @
--SELECT * FROM SESSION.UNIONX ORDER BY BXUPCR,W1,W2,W3,W4,W5,W6,W7,W8,W9 @

--FINAL RESULT:
DECLARE GLOBAL TEMPORARY TABLE SESSION.PICKER AS
    (   SELECT      BXUPCR, TRIM(W1 || ' ' || W2 || ' ' || W3 || ' ' || W4 || ' ' || W5 || ' ' || W6 || ' ' || W7 || ' ' || W8 || ' ' || W9) AS WORDS
        FROM        SESSION.UNIONX
        INNER JOIN  TABLE(  SELECT      BXUPCR, MAX(WC) AS WC
                            FROM        SESSION.UNIONX
                            GROUP BY    BXUPCR
                         )  AS TMP USING(BXUPCR, WC)
    )   WITH DATA WITH REPLACE @
SELECT * FROM SESSION.PICKER ORDER BY 1 @

A DGTT can be accessed in parallel as part of the normal insert/select/etc, just like any other table, so any statements are potentially parallizable per-row, as usual. You also _usually_ want to avoid explicit loops, because of this. Can we get a little more detail on what you're trying to accomplish here? — Clockwork-Muse, Sep 16 '19 at 23:24
You have a problem - if words are in different orders among two instances of the same UPC, they'll be different rows. More generally, you want words to be in the same order for identical numbers of words. What does `BXADDS` represent? If it's a space-delimited list of values (violating normal forms isn't great), your best bet is probably defining a table function to spit out _rows of words_. (This would be easier on LUW). Which would pretty much automatically get you infinite words (and pivoting will be a little easier, too). — Clockwork-Muse, Sep 19 '19 at 18:57
BXADDS is the product description as a human-readable title, such as "Mike's Harder Black Cherry Lemonade". We're pretty good about keeping the words in proper order; we must or our 50+ route salesmen would yell at us that they don't appear/sort properly in the product list. Also, you've given me another idea: what if words are common, but appear at different positions? Such as "Mike's Harder Black Cherry Lemonade" and "Mike's Harder Watermelon Lemonade"..? The common words would be "Mike's Harder Lemonade"; the unique words in the middle are skipped. — spinjector, Sep 19 '19 at 20:05
...what is it you're actually trying to accomplish here? What is it you need the information for? If you just want to query for all products matching some set of words, you're looking at some type of full-text search (DB2 for i _may_ have some native capability). If you're trying to find common elements for something like reporting, you can't escape splitting [(An example using a recursive CTE)](https://stackoverflow.com/a/55111039/812837), but you might be able to do something like cache the split into a permanent tag table. — Clockwork-Muse, Sep 19 '19 at 20:21
I have a report. It gets sales data for 4 months previous to the 1st. Any item sold to an account in the 4th month that wasn't in 1-3 is considered "new". Previously it was keyed by item number, but I was asked to change it to use the UPC. But due to the ways of "UPC", many products have the same code. We have one brand where 11 items have the same UPC. And each product has two codes: package & container, with one taking priority *per account*. I made it work, but it's only numbers. This last part is an attempt to get "common words" for items with same UPC, to make it friendly to human eyes. — spinjector, Sep 19 '19 at 20:41
...Doing this fancy word stuff would seem (to me) to create _more_ confusion. It's not out of the realm of possibility that there would only be 1 (or even _no_) common word for a given UPC. Would you prefer some sort of rollup? That is, the total for the UPC, and the subtotals for each item number? Otherwise, is there some master UPC file you can use for the description? I mean, if you're scanning a UPC, that's the only thing a register has to key for printing the receipt, so where do those descriptions come from? — Clockwork-Muse, Sep 19 '19 at 20:51

score 0 · Answer 1 · answered Sep 17 '19 at 13:28

A temporary table created with Declare global temporary table is implemented in QTEMP which is unique per job, not per thread. That said, IBM i performs caching behind the scenes, and is likely much better at it than you could ever be, so often the attempt to cache data is just wasted cycles and memory caching things that are already cached. I suggest trying to perform your process without manual caching, and then if you need improved performance somewhere, start with indexes.

SQL is a declarative language, not a procedural one. That means the best way to do things is to just tell it about the result you want, and let the optimizer decide how to get it. Think in sets not procedures.

I've added an example to the end of my OP. It's quite possible that I've been able to develop the DGTT's right out of it. — spinjector, Sep 18 '19 at 14:37

score 0 · Answer 2 · answered Sep 17 '19 at 17:58

0

I would like to ask: can a DGTT or any other type of temporary table be created so its scope is limited only to the current instance of the function..?

No.

If a declared temporary table, or an index or view that is dependent on a declared temporary table already exists with the same name, an error is returned.

You may avoid such a error with the WITH REPLACE clause. The corresponding DGTT is dropped silently and recreated if it exists in this case.
So, just use this clause in your DGTT declaration, if you don't need old contents of this DGTT.

answered Sep 17 '19 at 17:58

Mark Barinstein

11,456
2
8
16

Yes, I use `WITH REPLACE` regularly. However, my concern was multiple instances of the function running in parallel clashing with each other and walking all over different data in the same DGTT. FYI - I've added an example to the end of my OP. – spinjector Sep 18 '19 at 14:36
If you are going to use a function working with DGTTs multiple times `in the same query`, you should obviously keep it in mind, that such a function shouldn't do the things like dropping / creating / modifying any "non-private" parts of the same DGTT. You may avoid such an interference with unique DGTT for each function invocation or using some "private key" for the same DGTT. – Mark Barinstein Sep 18 '19 at 17:25
I thought the same thing. I was thinking of generating a random id like a GUID or UUID for each instance of the DGTT, and then either use that as a unique name for the DGTT (which then becomes dynamic SQL), or use the same DGTT for all instances to dump their junk in, but add an "instance" column for the generated id so each instance keeps its junk straight. Deeper & deeper... – spinjector Sep 18 '19 at 18:17

Using a temporary table in a DB2 function created for parallel processing..?

2 Answers2