
I am trying to optimize a query patterned as below. A 1-trillion-row fact table (F1) (apparently fork-lifted out of another proprietary system into Teradata, and no one has taken the trouble to change the data model to a more TD-compliant one) is joined to another billion-row fact table (F2) as part of the query. Needless to say, when the query ran it took 100K impact CPU, and throttles were then put in place to never let that happen again. When it completes it returns about 3 million rows (COUNT(*)), which is decent row elimination compared to the input volume, and that is why I've been breaking my head trying to figure a way out. But I can't get into the optimizer's brain as well as the gurus can, so I am throwing this open to the floor.

SELECT
    DimTb1.T1C1,
    (CASE expression on DimTb2.T2C3),
    -- a few more select-list items involving dimension tables DimTb3 through DimTb8
    SUM(F1.C1),
    SUM(F1.C2),
    SUM(F1.C3),
    SUM(F1.C4),
    -- etc.
    CASE WHEN F2.C1 IN ( <1000+ value list> ) THEN 'string' ELSE 'string2' END
FROM
    DimTB1
    INNER JOIN DimTB2 ON ( = condition )
    INNER JOIN F1 ON ( = condition )
    /* luckily all are INNER JOINs */
GROUP BY ...

F1 has its PI on (Cy) while F2 has its PI on (Cx), BUT F1 also has the Cx column, so F1 and F2 are joined on Cx.

I checked the stats: all join columns and indexes have current stats. The optimizer wants combination stats, but the rule of law here is that you can't collect ad hoc stats in production; that has to go through some admin overhead. Also, most of the recommended ones are column-combination stats on select-list columns, not WHERE-clause ones.
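For reference, the standard Teradata diagnostics below show which stats the optimizer wishes it had; the `COLLECT STATISTICS` line is illustrative only (it uses the placeholder join columns named later in the question and would need change control before running in production):

```sql
-- Ask the optimizer to append the stats it recommends to each EXPLAIN:
DIAGNOSTIC HELPSTATS ON FOR SESSION;

-- Then EXPLAIN the query; the recommended stats appear after the plan text.

-- An illustrative multi-column stat on the fact table's join/PI columns:
COLLECT STATISTICS COLUMN (Cx, Cy) ON F1;
```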

What I did

I could get F1 and F2 together in a volatile table beautifully after redistributing F2 on the PI of F1 (F1 is the anaconda table; F2 is the rattlesnake). I could even get a few dimension tables into the VT, but when I take all the dimensions the query chokes on spool (and these are very small dimensions, just a few hundred thousand rows down to thousands of rows each). I also reduce the dimension column selection by creating derived tables from the dimensions just before the join (little 2-3 column dimension tables with a GROUP BY). The problem is that when one of the dimension tables joins the fact tables, it tears the query apart, and I don't know what to do. How do you tune this kind of query, and what are some tuning strategies for fact and dimension joins where aggregation is being performed? OK, here is the complete explain plan (an actual run-time explain, not a pre-run one), and it gets stuck at step 17. I don't know what life looks like after step 17 because it won't go past it, or at least I haven't given it the spool and time to let it.
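The volatile-table step described above can be sketched roughly as follows. All column names are the placeholders from the query pattern (Cx is the F1-F2 join column, Cy is F1's PI); the 1000+ value list stays elided as in the question:

```sql
-- Sketch: combine the two facts once, keeping only the columns the
-- final query needs, so the expensive fact-to-fact join happens once.
CREATE VOLATILE TABLE VT_F1F2 AS (
    SELECT
        F1.Cx,
        F1.Cy,
        F1.C1, F1.C2, F1.C3, F1.C4,
        CASE WHEN F2.C1 IN ( <1000+ value list> )
             THEN 'string' ELSE 'string2' END AS F2_Flag
    FROM F1
    INNER JOIN F2
        ON F1.Cx = F2.Cx          -- eliminates ~76% of F1's rows per the question
) WITH DATA
PRIMARY INDEX (Cy)                -- matching F1's PI, as described above;
                                  -- adjust to the real distribution column
ON COMMIT PRESERVE ROWS;
```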

Step 1 (No Confidence)
Estimated Time: 0:00:00.000  Estimated Rows: 0
Actual Time: 0:00:00.310  Actual Rows: 504
First, we lock 1BillonFactVw for access, we lock DimTB1 for access, we lock DimTB2 for access, we lock DimTb8 for access, we lock DimTb3 for access, we lock DimTb4 for access, we lock DimTb5 for access, we lock DimTb8 for access, we lock DimTb9 for access, we lock DimTb6 for access and we lock DimTb7 for access.
Step 2 (No Confidence)
Estimated Time: 0:00:00.000  Estimated Rows: 0
Actual Time: 0:00:01.050  Actual Rows: 504
Next, we create the table header.
Step 3 (No Confidence)
Estimated Time: 0:00:00.000  Estimated Rows: 0
Actual Time: 0:00:00.600  Actual Rows: 0
We create the index subtable on Spool 50652.
Step 4 (No Confidence)
Estimated Time: 0:00:00.000  Estimated Rows: 0
Actual Time: 0:00:00.020  Actual Rows: 0
We modify the table header Spool 50652.
Step 5 (No Confidence)
Estimated Time: 0:00:00.000  Estimated Rows: 0
Actual Time: 0:00:00.020  Actual Rows: 0
We create the index subtable on Spool 50652.
Step 6 (No Confidence)
Estimated Time: 0:00:00.000  Estimated Rows: 0
Actual Time: 0:00:00.020  Actual Rows: 0
We modify the table header Spool 50652.
Step 7 (No Confidence)
Estimated Time: 0:00:00.000  Estimated Rows: 0
Actual Time: 0:00:00.000  Actual Rows: 0
We create the index subtable on Spool 50652.
Step 8 (No Confidence)
Estimated Time: 0:00:00.000  Estimated Rows: 0
Actual Time: 0:00:00.010  Actual Rows: 0
We modify the table header Spool 50652.
Step 9 (High Confidence)
Estimated Time: 0:00:00.010  Estimated Rows: 70576
Actual Time: 0:00:00.180  Actual Rows: 70576
We do an All-AMPs RETRIEVE step from DimTb8 by way of an all-rows scan into Spool 12844, which is built locally on the AMPs. This step begins a parallel block of steps.
Step 9 (High Confidence)
Estimated Time: 0:00:00.010  Estimated Rows: 365
Actual Time: 0:00:00.180  Actual Rows: 365
We do an All-AMPs RETRIEVE step from DimTb9 by way of an all-rows scan into Spool 12843, which is built locally on the AMPs. This step is performed in parallel.
Step 9 (High Confidence)
Estimated Time: 0:00:00.010  Estimated Rows: 302
Actual Time: 0:00:00.180  Actual Rows: 302
We do an All-AMPs RETRIEVE step from DimTb4 by way of an all-rows scan into Spool 12842, which is built locally on the AMPs. This step is performed in parallel.
Step 9 (Low Confidence)
Estimated Time: 0:00:00.010  Estimated Rows: 5035
Actual Time: 0:00:00.180  Actual Rows: 5035
We do an All-AMPs RETRIEVE step from DimTB2 by way of an all-rows scan into Spool 12841, which is built locally on the AMPs. This step is performed in parallel.
Step 9 (Low Confidence)
Estimated Time: 0:00:00.010  Estimated Rows: 22
Actual Time: 0:00:00.040  Actual Rows: 18
We do an All-AMPs RETRIEVE step from DimTb7 by way of an all-rows scan into Spool 12840, which is built locally on the AMPs. This step is performed in parallel.
Step 9 (No Confidence)
Estimated Time: 0:00:00.010  Estimated Rows: 6
Actual Time: 0:00:00.220  Actual Rows: 46
We do an All-AMPs RETRIEVE step from DimTb3 by way of an all-rows scan into Spool 12839, which is built locally on the AMPs. This step is performed in parallel.
Step 9 (Low Confidence)
Estimated Time: 0:00:00.010  Estimated Rows: 1
Actual Time: 0:00:00.030  Actual Rows: 1
We do an All-AMPs RETRIEVE step from DimTb8 by way of an all-rows scan into Spool 12838, which is built locally on the AMPs. This step is performed in parallel.
Step 9 (High Confidence)
Estimated Time: 0:00:00.010  Estimated Rows: 37
Actual Time: 0:00:00.020  Actual Rows: 37
We do an All-AMPs RETRIEVE step from DimTb5 by way of an all-rows scan into Spool 12837, which is built locally on the AMPs. This step is performed in parallel.
Step 9 (Low Confidence)
Estimated Time: 0:00:00.010  Estimated Rows: 1020
Actual Time: 0:00:00.060  Actual Rows: 1020
We do an All-AMPs RETRIEVE step from DimTB1 by way of an all-rows scan into Spool 12836, which is built locally on the AMPs. This step ends a parallel block of steps.
Step 10 (Low Confidence)
Estimated Time: 0:00:00.010  Estimated Rows: 504
Actual Time: 0:00:00.070  Actual Rows: 504
We do an All-AMPs RETRIEVE step from Spool 12838 (Last Use) by way of an all-rows scan into Spool 12858, which is duplicated on all AMPs. This step begins a parallel block of steps.
Step 10 (High Confidence)
Estimated Time: 0:00:00.027  Estimated Rows: 183960
Actual Time: 0:00:00.450  Actual Rows: 183960
We do an All-AMPs RETRIEVE step from Spool 12843 (Last Use) by way of an all-rows scan into Spool 12859, which is duplicated on all AMPs. This step ends a parallel block of steps.
Step 11 (No Confidence)
Estimated Time: 0:00:16.395  Estimated Rows: 11673025
Actual Time: 0:01:04.010  Actual Rows: 400146688
We do an All-AMPs JOIN step from Spool 12858 (Last Use) by way of an all-rows scan, which is joined to table 1BillonFactVw. Spool 12858 and table 1BillonFactVw are joined using a product join. The result goes into Spool 12860, which is built locally on the AMPs. This step begins a parallel block of steps.
Step 11 (No Confidence)
Estimated Time: 0:00:00.005  Estimated Rows: 3024
Actual Time: 0:00:02.160  Actual Rows: 23184
We do an All-AMPs RETRIEVE step from Spool 12839 (Last Use) by way of an all-rows scan into Spool 12861, which is duplicated on all AMPs. This step ends a parallel block of steps.
Step 12 (No Confidence)
Estimated Time: 0:00:00.073  Estimated Rows: 534319
Actual Time: 0:00:13.630  Actual Rows: 117102184
We do an All-AMPs JOIN step from Spool 12859 (Last Use) by way of an all-rows scan, which is joined to Spool 12860. Spool 12859 and Spool 12860 are joined using a single partition hash join. The result goes into Spool 12862, which is built locally on the AMPs.
Step 13 (Low Confidence)
Estimated Time: 0:00:00.005  Estimated Rows: 11088
Actual Time: 0:00:04.060  Actual Rows: 9072
We do an All-AMPs RETRIEVE step from Spool 12840 (Last Use) by way of an all-rows scan into Spool 12863, which is duplicated on all AMPs. This step begins a parallel block of steps.
Step 13 (No Confidence)
Estimated Time: 0:00:00.011  Estimated Rows: 58290
Actual Time: 0:00:08.020  Actual Rows: 101026456
We do an All-AMPs JOIN step from Spool 12861 (Last Use) by way of an all-rows scan, which is joined to Spool 12862. Spool 12861 and Spool 12862 are joined using a merge join. The result goes into Spool 12864, which is built locally on the AMPs. This step ends a parallel block of steps.
Step 14 (No Confidence)
Estimated Time: 0:00:00.024  Estimated Rows: 26717
Actual Time: 0:00:22.990  Actual Rows: 99885864
We do an All-AMPs JOIN step from Spool 12863 (Last Use) by way of an all-rows scan, which is joined to Spool 12864. Spool 12863 and Spool 12864 are joined using a merge join. The result goes into Spool 12865, which is redistributed by hash code to all AMPs. This step begins a parallel block of steps.
Step 14 (High Confidence)
Estimated Time: 0:00:00.005  Estimated Rows: 18648
Actual Time: 0:00:01.940  Actual Rows: 18648
We do an All-AMPs RETRIEVE step from Spool 12837 (Last Use) by way of an all-rows scan into Spool 12866, which is duplicated on all AMPs. This step ends a parallel block of steps.
Step 15 (No Confidence)
Estimated Time: 0:00:00.019  Estimated Rows: 941
Actual Time: 0:04:08.050  Actual Rows: 95316544
We do an All-AMPs JOIN step from Spool 12865 (Last Use) by way of an all-rows scan, which is joined to table DimTb6. Spool 12865 and table DimTb6 are joined using a merge join. The result goes into Spool 12867, which is built locally on the AMPs. This step begins a parallel block of steps.
Step 15 (Low Confidence)
Estimated Time: 0:00:00.005  Estimated Rows: 1020
Actual Time: 0:00:03.430  Actual Rows: 1020
We do an All-AMPs RETRIEVE step from Spool 12836 (Last Use) by way of an all-rows scan into Spool 12868, which is redistributed by hash code to all AMPs. This step ends a parallel block of steps.
Step 16 (No Confidence)
Estimated Time: 0:00:00.022  Estimated Rows: 941
Actual Time: 0:01:01.970  Actual Rows: 93115832
We do an All-AMPs JOIN step from Spool 12866 (Last Use) by way of an all-rows scan, which is joined to Spool 12867. Spool 12866 and Spool 12867 are joined using a merge join. The result goes into Spool 12869, which is redistributed by hash code to all AMPs.
Step 17 (No Confidence, Active)
Estimated Time: 0:00:00.053  Estimated Rows: 474264
Actual Time: –  Actual Rows: – (the query is stuck in this step)
We do an All-AMPs JOIN step from Spool 12868 (Last Use) by way of an all-rows scan, which is joined to Spool 12869. Spool 12868 and Spool 12869 are joined using a merge join. The result goes into Spool 12870, which is duplicated on all AMPs.
Step 18 (No Confidence)
Estimated Time: 0:00:00.012  Estimated Rows: 941
Actual Time: –  Actual Rows: –
We do an All-AMPs JOIN step from Spool 12870 (Last Use) by way of an all-rows scan, which is joined to Spool 12841. Spool 12870 and Spool 12841 are joined using a dynamic hash join. The result goes into Spool 12871, which is redistributed by hash code to all AMPs.
Step 19 (High Confidence)
Estimated Time: 0:00:00.036  Estimated Rows: 70576
Actual Time: –  Actual Rows: –
We do an All-AMPs RETRIEVE step from Spool 12844 (Last Use) by way of an all-rows scan into Spool 12872, which is redistributed by hash code to all AMPs. This step begins a parallel block of steps.
Step 19 (High Confidence)
Estimated Time: 0:00:00.018  Estimated Rows: 152208
Actual Time: –  Actual Rows: –
We do an All-AMPs RETRIEVE step from Spool 12842 (Last Use) by way of an all-rows scan into Spool 12873, which is duplicated on all AMPs. This step ends a parallel block of steps.
Step 20 (No Confidence)
Estimated Time: 0:00:00.010  Estimated Rows: 941
Actual Time: –  Actual Rows: –
We do an All-AMPs JOIN step from Spool 12871 (Last Use) by way of an all-rows scan, which is joined to Spool 12872. Spool 12871 and Spool 12872 are joined using a merge join. The result goes into Spool 12874, which is built locally on the AMPs.
Step 21 (No Confidence)
Estimated Time: 0:00:00.010  Estimated Rows: 941
Actual Time: –  Actual Rows: –
We do an All-AMPs JOIN step from Spool 12873 (Last Use) by way of an all-rows scan, which is joined to Spool 12874. Spool 12873 and Spool 12874 are joined using a merge join. The result goes into Spool 12854, which is redistributed by hash code to all AMPs.
Step 22 (No Confidence)
Estimated Time: 0:00:01.000  Estimated Rows: 941
Actual Time: –  Actual Rows: –
We do a MERGE into Spool 50652 from Spool 12854.
Step 23 (No Confidence)
Estimated Time: 0:00:00.000  Estimated Rows: 0
Actual Time: –  Actual Rows: –
We send out an END TRANSACTION step to all AMPs involved in processing the request.
  • You definitely have an estimation problem, with actuals 100,000 times more than estimated, so the actual number for step 17 might be close to 100,000,000,000. And of course all following steps show the same underestimation, ouch. There's a lot of *No Confidence*, probably missing statistics or a calculation leading to a loss of stats. And if you actually have a 1,000,000,000,000-row table, there should be some physical data modeling. – dnoeth Oct 13 '15 at 11:54
  • You have a product join in step 11 with the join to the billion-row fact table. Unfortunately, you have stripped the SQL example beyond the point at which we can help figure out why that may be occurring. That product join might not be a bad thing, but it's a starting point. (+@dnoeth's comments) – Rob Paller Oct 13 '15 at 11:56
  • I used DNPJ and momentarily got rid of the PJ step. I absolutely agree: this thing is blind on stats. All I did was change the table and column names from the original query; everything else is the same. What I am doing is pulling the join into a VT and then doing the aggregation on that join. I wondered if there is a better approach where I could aggregate to eliminate rows. Dieter, you are right that the DM is an issue, because this is a star schema fork-lifted from another DB. – user1874594 Oct 13 '15 at 15:27
  • This thing is blind on stats, and I cannot define new stats in prod or take them temporarily. That is the law here, so I have to work as a law-abiding DBA, and there are too many laws to work with. Another thing: joining F1 and F2 eliminates 76% of F1's rows, but NONE of the other tables help eliminate rows of F1. In spite of this, if the final query returns just a few million rows, the elimination is happening in the aggregation. So I am trying to understand whether you have any ideas on how I can reduce the rows via aggregation before F1 is joined to the rest of the tables. – user1874594 Oct 13 '15 at 15:32
  • Without the actual SQL & Explain (after `DIAGNOSTIC HELPSTATS ON FOR SESSION` and `DIAGNOSTIC VERBOSE EXPLAIN ON FOR SESSION`) & DDL & `HELP STATS`, there's nothing to do. Regarding "no new stats": that's actually a totally stupid law. There must be a process to implement new stats in production; otherwise you're doomed... – dnoeth Oct 14 '15 at 14:28
  • It absolutely is, and as in most situations, I just got here not long ago (poor little me :) ... nope, that is not my consolation). Teradata Professional Services put this forklift data model in place here. You can't take stats or define new stats unless you go through change control. Change control == approvers + testers + meetings. TYSM for looking at it. I am going to follow up on this question with the verbose explain output. – user1874594 Oct 14 '15 at 16:24
  • Hello, pundits. I managed to get the impact CPU down from 300K+ to 4K, and spool usage from 4 TB to 32 GB, using a VT that redistributes across the dimensions and then aggregates (the VT also bunched in a few more tables where I found elimination had happened, especially putting the anaconda and the rattlesnake together). I remember doing this long ago, but in other areas: proprietary ETL tools and big data. I keep forgetting. I wanted to know if there are any other strategies for optimizing fact-to-dimension joins that span multiple tables. Thank you, learned people. – user1874594 Oct 19 '15 at 23:23
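The approach described in the last comment might look roughly like this: aggregate the combined fact down to its dimension keys first, so the dimension joins run against a few million rows instead of billions. All table and column names below are the placeholders from the question:

```sql
-- Sketch: collapse the fact rows to the grain of the dimension keys FIRST,
-- then join the (now small) aggregate to the dimensions.
CREATE VOLATILE TABLE VT_Agg AS (
    SELECT
        Cy,                        -- F1's PI / dimension key
        Cx,                        -- F1-F2 join column
        -- ... plus the other dimension foreign keys ...
        SUM(C1) AS C1, SUM(C2) AS C2, SUM(C3) AS C3, SUM(C4) AS C4
    FROM VT_F1F2                   -- hypothetical VT holding the F1+F2 join
    GROUP BY Cy, Cx                -- group by dimension keys only
) WITH DATA
PRIMARY INDEX (Cy)
ON COMMIT PRESERVE ROWS;

-- The final query then joins this small aggregate to the dimensions,
-- which the optimizer can duplicate cheaply to all AMPs:
SELECT DimTB1.T1C1, SUM(a.C1), SUM(a.C2), SUM(a.C3), SUM(a.C4)
FROM VT_Agg a
INNER JOIN DimTB1 ON ( = condition )
GROUP BY DimTB1.T1C1;
```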

0 Answers