0

Trying to compile a "statistics Hoggers" report . All those users who hogged away CPU running statistics
On what "table.cols" ( or col1,col2 etc) , did they run stats and when they ran it.

I wrote the below report but I can see its far from real

  • It does not "Split" the CPU on a given query by some proportion of weights to table. So if in a stats operation - the most expensive CPU was on FACT.BILLION_DOLLAR table but there was also a DIMENSION.DWARF table , DIMENSION.DWARF will spuriously show up on the chart -which makes the report misleading.
    I am also trying to compile another report where I want the TOP CPU by TABLE. Its not "Strictly" poss because the CPU is for a query not object but inside a query I want to "split" the CPU in proportion ( I guess the count(*) would be 1 criteria ). So HOW do I get this done
  • It "pulls over the wrong guy"- the username against running the stats operation shows up incorrectly. Our production ID that runs stats is SWPRDUSR but the top stats user shows up as SYSPRDUSR who is the system wide prod. user and he really does'nt mess with our stuff- so I know something is amiss here.
    Here's what I am running I am running this report not system wide BUT only for my My database , cascaded


    sel a.username, s.ObjectTableName, s.objectdatabasename, --s.ObjectColumnName, cast ( s.CollectTimeStamp as date ) , CAST( SUM((((a.AmpCPUTime(DEC(18,3)))+ ZEROIFNULL(a.ParserCPUTime)) )) AS DECIMAL(18,3)) as Total_CPU from
    DBC.DBQLogtbl a join DBC.DBQLoBJTBL s on ( s.ProcID = a.ProcID and cast ( s.CollectTimeStamp as date ) = cast ( a.CollectTimeStamp as date ) ) where objectdatabasename in ( sel child
    from dbc.children where parent ='FINDB'
    group by 1 ) and ObjectType='tab' and statementType='collect statistics' group by 1,2,3,4 UNION ALL sel a.username, s.ObjectTableName, s.objectdatabasename, s.Logdate, --s.ObjectColumnName, CAST( SUM((((a.AmpCPUTime(DEC(18,3)))+ ZEROIFNULL(a.ParserCPUTime)) )) AS DECIMAL(18,3)) as Total_CPU from
    PDCRinfo.DBQLogtbl a join PDCRinfo.dbqlobjtbl_hst s on ( s.queryID = a.queryID and s.Logdate = a.Logdate )
    where objectdatabasename in ( sel child
    from dbc.children where parent ='FINDB'
    group by 1 ) and ObjectType='tab' and statementType='collect statistics' group by 1,2,3,4 order by 5 desc , 3 asc, 2 asc, 1 asc ;

user1874594
  • 2,277
  • 1
  • 25
  • 49

1 Answers1

1

In the 1st Select there's a missing join condition: s.queryID = a.queryID

Collect Stats is always single table, no need to split CPU.

dnoeth
  • 59,503
  • 4
  • 39
  • 56
  • You are right. TY ... What a bad miss. I need new pair of glasses..... or whatever is behind those...or whatever is behind ( behind those )... I saw the PI as Logdate and ProcID on the PI ( not sure why QueryID was'nt there . If queryID is there it'd be a PI - PI join with the query ID giving better distribution and part of the join )..and grabbed those...assuming there's nothing beyond those. OBJECT , LOG and SQL all 3 tables have QueryID in there. Why not keep it in the PI . Thanks for the info Dieter – user1874594 Jan 22 '16 at 16:05
  • 1
    @user1874594: The PI of the DBQL tables in dbc is only for writing the cached data efficiently as a single data block. That's why the data must be moved to a history table daily, otherwise all joins on QueryId perform really bad. – dnoeth Jan 22 '16 at 18:37
  • Hi Dieter. On a corollary to that - if I need to know tables with Max CPU Hits . How do I go about getting these - there multiple tables per Query – user1874594 Jan 24 '16 at 16:15
  • 1
    There's no way to split CPU between tables. – dnoeth Jan 24 '16 at 16:29
  • I'm thinking of using this kind of logic. If the stats table has a row count value that is less than 5 days old... go ahead and pick the value or else do a count (*) . Then use a formula that does a CPU for a queryid per row per table. So for a query. If there are 3 facts and 5 dimensions ..... and there is a high CPU... Facts get a higher wt. It is not picture perfect.... but it helps to avoid misleading CPU information ...eg A simple sys calendar table used in a high CPU query has very high CPU hits. – user1874594 Jan 24 '16 at 16:37
  • On that last part , how do I get a row count kind of logic inside a case statement. Eg. `Select tablename , case when tablename like '%bigtb%' then sel count ( '1' ) from databasename.tablename else 'dropped' end`. This is definitely not something that's going to work.... but it's just the bare bones logic that would go inside the above kind of query. I wondered how should I go about doing that. Thank you very much again – user1874594 Jan 24 '16 at 16:53
  • *A simple sys calendar table* can be the cause of high CPU usage, depending on the actual query (I've seen very stupid things on calendars). So don't care about individual tables, check for individual queries causing high resource usage. – dnoeth Jan 24 '16 at 17:20