We are on Teradata 14. Two tables are being left outer joined (LOJ). Each table's PI has only 1.5% skew, yet when either table is redistributed on the other's join key, the spool ends up 98% skewed and the query just hangs on the merge join step.
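For reference, the per-AMP distribution that a redistribution on the join key would produce can be estimated with Teradata's hash functions (a sketch; A and x stand in for the real table and join column):

```sql
-- Estimate how rows of A would spread across AMPs if hashed on x.
-- HASHROW / HASHBUCKET / HASHAMP are standard Teradata functions.
SELECT HASHAMP(HASHBUCKET(HASHROW(x))) AS amp_no,
       COUNT(*)                        AS row_cnt
FROM   A
GROUP  BY 1
ORDER  BY row_cnt DESC;
```

One AMP carrying the bulk of the rows (often caused by NULLs or a single dominant value in the join column, which all hash to the same AMP) would match the 98% spool skew described.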
sel A.x, count('1') from A left outer join B on A.x = B.y where B.y is NULL group by A.x
This is production code (written some time ago) that has started failing. I can rewrite it with an exclusion join and it works just fine, but changing the code is not an option. The plan redistributes B by A.x, so if I build a spool with B.* plus A.x (A.x as the PI) there is 98% skew; vice versa is 98% skewed as well (A.* redistributed on B.y). Stats are up to date.

Without changing the original query, the only other options I see are: (1) run a diagnostic statement before the query, after which the plan avoids the redistribution step and runs in < 5 s; or (2) drop stats on the join column, in the hope that the optimizer will then avoid redistributing one table by the other.

My question: how can I get the optimizer to choose the alternate plan it produced when I ran the diagnostic statement? Per PDCR, the tables being joined were the same size back when the query still ran (though it used a good amount of spool); now it hangs. This is a small 50-AMP system.
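For comparison, the exclusion-join rewrite mentioned above (which works but cannot be deployed) might look like this, with A, B, x, and y standing in for the real names:

```sql
-- NOT EXISTS form: the optimizer can plan this as an exclusion join
-- instead of materializing the full outer join before the NULL filter.
SELECT A.x,
       COUNT('1')
FROM   A
WHERE  NOT EXISTS (SELECT 1
                   FROM   B
                   WHERE  B.y = A.x)
GROUP  BY A.x;
```

This returns the same result as the original query because the B.y IS NULL filter on a left outer join keeps exactly the A rows with no match in B.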
Update in response to the comment: here is the explain WITHOUT the diagnostic. It hangs on the merge join step.
Explain
--diagnostic verboseexplain on for session ;
SELECT F.keyX ,
COUNT('1')
FROM DB.TB1 F LEFT OUTER JOIN DB.TB2 D
ON F.keyX = D.keyY
where (F.keyX IS NOT NULL
AND D.keyY IS NULL )
GROUP BY F.keyX;
This query is optimized using type 2 profile insert-sel, profileid
10001.
1) First, we lock a distinct DB."pseudo table" for read on a
RowHash to prevent global deadlock for DB.D.
2) Next, we lock a distinct DB."pseudo table" for read on a
RowHash to prevent global deadlock for DB.F.
3) We lock DB.D for read, and we lock DB.F for read.
4) We do an all-AMPs RETRIEVE step from DB.F by way of an
all-rows scan with no residual conditions into Spool 4 (all_amps)
(compressed columns allowed), which is redistributed by hash code
to all AMPs with hash fields ("DB.F.keyX")
and Field1 ("DB.F.ROWID"). Then we do a SORT to order
Spool 4 by row hash. The size of Spool 4 is estimated with high
confidence to be 270,555 rows (5,952,210 bytes). Spool AsgnList:
"Field_1" = "DB.F.ROWID",
"keyX" = "DB.F.keyX".
5) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 4 by way of an
all-rows scan into Spool 5 (all_amps) (compressed columns
allowed), which is duplicated on all AMPs with hash fields (
"Spool_4.keyX") and Field1 ("Spool_4.Field_1").
The size of Spool 5 is estimated with high confidence to be
24,349,950 rows (535,698,900 bytes). Spool AsgnList:
"Field_1" = "Spool_4.Field_1",
"keyX" = "Spool_4.keyX".
The estimated time for this step is 0.42 seconds.
2) We do an all-AMPs RETRIEVE step from DB.D by way of
an all-rows scan with no residual conditions into Spool 6
(all_amps) (compressed columns allowed), which is built
locally on the AMPs with hash fields (
"DB.D.keyY"). The size of Spool 6 is
estimated with high confidence to be 307,616 rows (5,229,472
bytes). Spool AsgnList:
"keyY" = "DB.D.keyY".
The estimated time for this step is 0.02 seconds.
6) We do an all-AMPs JOIN step (Global sum) from Spool 6 (Last Use)
by way of an all-rows scan, which is joined to Spool 5 (Last Use)
by way of an all-rows scan. Spool 6 is used as the hash table and
Spool 5 is used as the probe table in a joined using a single
partition classical hash join, with a join condition of (
"Spool_5.keyX = Spool_6.keyY"). The result
goes into Spool 7 (all_amps) (compressed columns allowed), which
is redistributed by hash code to all AMPs with hash fields (
"Spool_5.keyX") and Field1 ("Spool_5.Field_1"). Then
we do a SORT to order Spool 7 by row hash. The size of Spool 7 is
estimated with low confidence to be 270,555 rows (7,034,430 bytes).
Spool AsgnList:
"Field_1" = "Spool_5.Field_1",
"keyY" = "{LeftTable}.keyY",
"Field_3" = "Spool_5.keyX".
7) We do an all-AMPs JOIN step (Global sum) from Spool 7 (Last Use)
by way of a RowHash match scan, which is joined to Spool 4 (Last
Use) by way of a RowHash match scan. Spool 7 and Spool 4 are
right outer joined using a merge join, with a join condition of (
"Spool_7.Field_1 = Spool_4.Field_1"). The result goes into Spool
8 (all_amps) (compressed columns allowed), which is built locally
on the AMPs. The size of Spool 8 is estimated with low confidence
to be 270,555 rows (5,681,655 bytes). Spool AsgnList:
"keyY" = "{LeftTable}.keyY",
"keyX" = "{RightTable}.keyX".
The estimated time for this step is 0.49 seconds.
8) We do an all-AMPs RETRIEVE step from Spool 8 (Last Use) by way of
an all-rows scan with a condition of ("Spool_8.keyY IS
NULL") into Spool 3 (all_amps) (compressed columns allowed), which
is built locally on the AMPs with Field1 ("25614"). The size of
Spool 3 is estimated with low confidence to be 270,555 rows (
5,140,545 bytes). Spool AsgnList:
"Field_1" = "25614",
"Spool_3.keyX" = "{ Copy }keyX".
The estimated time for this step is 0.01 seconds.
9) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan, and the grouping identifier in field 1.
Aggregate Intermediate Results are computed globally, then placed
in Spool 11. The size of Spool 11 is estimated with low
confidence to be 296 rows (6,216 bytes). The estimated time for
this step is 0.04 seconds.
10) We do an all-AMPs RETRIEVE step from Spool 11 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs with Field1 ("UniqueId"). The size of Spool 1 is
estimated with low confidence to be 296 rows (8,584 bytes). Spool
AsgnList:
"Field_1" = "UniqueId",
"Field_2" = "Field_2 ,Field_3 (INTEGER),".
The estimated time for this step is 0.01 seconds.
11) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.
This is the new explain AFTER the diagnostic is run. It runs in 5 seconds.
EXPLAIN SELECT F.keyY,
COUNT('1')
FROM DB.TB1 F LEFT OUTER JOIN DB.TB2 D
ON F.keyY = D.keyY
where (
D.keyY IS NULL )
GROUP BY F.keyY;
This query is optimized using type 2 profile insert-sel, profileid
10001.
1) First, we lock a distinct DB."pseudo table" for read on a
RowHash to prevent global deadlock for DB.D.
2) Next, we lock a distinct DB."pseudo table" for read on a
RowHash to prevent global deadlock for DB.F.
3) We lock DB.D for read, and we lock DB.F for read.
4) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from DB.D by way of
an all-rows scan with no residual conditions split into Spool
4 (all_amps) with a condition of ("keyY IN (:*)")
to qualify rows matching skewed rows of the skewed relation
(compressed columns allowed) and Spool 5 (all_amps) with
remaining rows (compressed columns allowed). Spool 4 is
duplicated on all AMPs with hash fields (
"DB.D.keyY"). Then we do a SORT to order
Spool 4 by row hash. The size of Spool 4 is estimated with
high confidence to be 180 rows. Spool 5 is built locally on
the AMPs with hash fields ("DB.D.keyY").
The size of Spool 5 is estimated with high confidence to be
307,614 rows. Spool 4 AsgnList:
"keyY" = "DB.D.keyY".
Spool 5 AsgnList:
"keyY" = "DB.D.keyY".
The estimated time for this step is 0.02 seconds.
2) We do an all-AMPs RETRIEVE step from DB.F by way of
an all-rows scan with no residual conditions split into Spool
6 (all_amps) with a condition of ("keyY IN (:*)")
to qualify skewed rows (compressed columns allowed) and Spool
7 (all_amps) with remaining rows (compressed columns allowed).
Spool 6 is built locally on the AMPs with hash fields (
"DB.F.keyY"). Then we do a SORT to
order Spool 6 by row hash. The size of Spool 6 is estimated
with high confidence to be 231,040 rows. Spool 7 is
redistributed by hash code to all AMPs with hash fields (
"DB.F.keyY"). Then we do a SORT to
order Spool 7 by row hash. The size of Spool 7 is estimated
with high confidence to be 39,515 rows. Spool 6 AsgnList:
"keyY" = "DB.F.keyY".
Spool 7 AsgnList:
"keyY" = "DB.F.keyY".
The estimated time for this step is 0.32 seconds.
5) We do an all-AMPs JOIN step (Global sum) from Spool 4 (Last Use)
by way of a RowHash match scan, which is joined to Spool 6 (Last
Use) by way of a RowHash match scan. Spool 4 and Spool 6 are
right outer joined using a merge join, with a join condition of (
"Spool_6.keyY = Spool_4.keyY"). The result
goes into Spool 8 (all_amps), which is built locally on the AMPs.
The size of Spool 8 is estimated with low confidence to be 231,040
rows (4,851,840 bytes). Spool AsgnList:
"keyY" = "{LeftTable}.keyY",
"keyY" = "{RightTable}.keyY".
The estimated time for this step is 0.01 seconds.
6) We do an all-AMPs JOIN step (Local sum) from Spool 5 (Last Use) by
way of a RowHash match scan, which is joined to Spool 7 (Last Use)
by way of a RowHash match scan. Spool 5 and Spool 7 are
right outer joined using a merge join, with a join condition of (
"Spool_7.keyY = Spool_5.keyY"). The result
goes into Spool 8 (all_amps), which is built locally on the AMPs.
The size of Spool 8 is estimated with low confidence to be 39,515
rows (829,815 bytes). Spool AsgnList:
"keyY" = "{LeftTable}.keyY",
"keyY" = "{RightTable}.keyY".
The estimated time for this step is 0.02 seconds.
7) We do an all-AMPs RETRIEVE step from Spool 8 (Last Use) by way of
an all-rows scan with a condition of ("Spool_8.keyY IS
NULL") into Spool 3 (all_amps) (compressed columns allowed), which
is built locally on the AMPs with Field1 ("26086"). The size of
Spool 3 is estimated with low confidence to be 270,555 rows (
5,140,545 bytes). Spool AsgnList:
"Field_1" = "26086",
"Spool_3.keyY" = "{ Copy }keyY".
The estimated time for this step is 0.01 seconds.
8) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan, and the grouping identifier in field 1.
Aggregate Intermediate Results are computed globally, then placed
in Spool 13. The size of Spool 13 is estimated with low
confidence to be 296 rows (6,216 bytes). The estimated time for
this step is 0.04 seconds.
9) We do an all-AMPs RETRIEVE step from Spool 13 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs with Field1 ("UniqueId"). The size of Spool 1 is
estimated with low confidence to be 296 rows (8,584 bytes). Spool
AsgnList:
"Field_1" = "UniqueId",
"Field_2" = "keyY ,Field_3 (INTEGER),".
The estimated time for this step is 0.01 seconds.
10) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 0.41 seconds.