2

I have a query like this...

SELECT cust_num, year,credit_cust FROM
(SELECT cust_num, year,credit_cust,
ROW_NUMBER () OVER(PARTITION BY cust_num, year ORDER BY credit_cust DESC) rnk
FROM credit_cust PARTITION (YEAR_2010)
)
WHERE rnk=1

When this query is executed, the CPU usage of the server where the DB is hosted increases by at least 35%. This really concerns when there is high load in the server as the CPU usage reaches 100% sometimes.

Requirement:

The data is like below...

Cust_num  Year    Credit_cust
  456     2010       Y
  456     2010       N
  456     2009       N
  456     2009       N

I expect the query to return only the records given below...

Cust_num  Year    Credit_cust
  456     2010       Y
  456     2009       N

Note: I have used ROW_NUMBER() as a workaround for GROUP BY clause. As, earlier, i thought its just the GROUP BY clause which causing the high CPU usage

Alternate query: (which is also having >35%CPU usage)

SELECT   cust_num, YEAR, MAX (credit_cust)
    FROM credit_cust PARTITION (year_2010)
GROUP BY cust_num, YEAR


Server : Sun OS 5.10
Database : Oracle 11g (11.2.0.2.0 -64bit)

Structure of credit_cust table is like below...

CREATE TABLE CREDIT_CUST
(
  CUST_NUM          NUMBER,
  YEAR     NUMBER,
  CREDIT_CUST  CHAR(1)
)
TABLESPACE PARTITION_01
PARTITION BY RANGE (YEAR) 
SUBPARTITION BY HASH (CUST_NUM)
(  
  PARTITION YEAR_2009 VALUES LESS THAN (2010)
    SUBPARTITIONS 16 STORE IN (PARTITION_01,PARTITION_02,PARTITION_03,PARTITION_04,PARTITION_05,PARTITION_06,
                               PARTITION_01,PARTITION_02,PARTITION_03,PARTITION_04,PARTITION_05,PARTITION_06,
                               PARTITION_01,PARTITION_02,PARTITION_03,PARTITION_04),  
  PARTITION YEAR_2010 VALUES LESS THAN (2011)
    SUBPARTITIONS 1 STORE IN (PARTITION_01))
PARALLEL ( DEGREE 16 INSTANCES 1 );

Please tell me why this issue comes up. Let me know if any more details are required.

EXPLAIN PLAN:

PLAN_TABLE_OUTPUT

Plan hash value: 3927595547

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                                |    29M|  1179M|       |  3937   (3)| 00:00:01 |       |       |        |      |            |
|   1 |  PX COORDINATOR                |                                |       |       |       |            |          |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)          | :TQ10001                       |    29M|  1179M|       |  3937   (3)| 00:00:01 |       |       |  Q1,01 | P->S | QC (RAND)  |
|*  3 |    VIEW                        |                                |    29M|  1179M|       |  3937   (3)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|*  4 |     WINDOW SORT PUSHED RANK    |                                |    29M|   364M|   674M|  3937   (3)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   5 |      PX RECEIVE                |                                |    29M|   364M|       |  3937   (3)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
|   6 |       PX SEND HASH             | :TQ10000                       |    29M|   364M|       |  3937   (3)| 00:00:01 |       |       |  Q1,00 | P->P | HASH       |
|*  7 |        WINDOW CHILD PUSHED RANK|                                |    29M|   364M|       |  3937   (3)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
|   8 |         PX BLOCK ITERATOR      |                                |    29M|   364M|       |    34   (3)| 00:00:01 |     1 |     1 |  Q1,00 | PCWC |            |
|*  9 |          TABLE ACCESS FULL     | CREDIT_CUST                    |    29M|   364M|       |    34   (3)| 00:00:01 |   228 |   228 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("RNK"=1)
   4 - filter(ROW_NUMBER() OVER ( PARTITION BY "CUST_NUM","YEAR" ORDER BY INTERNAL_FUNCTION("CREDIT_CUST") DESC )<=1)
   7 - filter(ROW_NUMBER() OVER ( PARTITION BY "CUST_NUM","YEAR" ORDER BY INTERNAL_FUNCTION("CREDIT_CUST") DESC )<=1)
   9 - filter("YEAR"=2010)

Note
-----
   - dynamic sampling used for this statement (level=5)
Vivek
  • 4,452
  • 10
  • 27
  • 45

2 Answers2

1

I have two suggestions:

  1. Partition the table properly. You have a partition subpartionned and another not(a single subpartition). A number of years is not obvious in a partition. I would use yearly partition by Year and subpartitions by hash on cust(8 or 16 subpartition per partition. 16 if table is massive and you have a lot of processors.)

  2. your query would return result for customers in year 2007, 2008, 2009, 2010. Are you sure you want four years? If you want only 2010, you should specify year = 2010 in where clause and would not be necessary to partition by year in analitic function, wich help the sorting(wich consume CPU and TEMP space).

Query would be like:

SELECT cust_num, year,credit_cust 
FROM
  (SELECT cust_num, year,credit_cust,
   ROW_NUMBER () OVER(PARTITION BY cust_num ORDER BY credit_cust DESC) rnk
   FROM credit_cust
   WHERE YEAR=2010
   )
WHERE rnk=1;

in General, is better to chose partition with where clause instead of specifying partition name(is is also easier)

Florin Ghita
  • 17,525
  • 6
  • 57
  • 76
  • It should return result for year 2010 alone as the partition year_2010 has only data for 2010. Or may be I'm not getting your point still. This table is partitioned by year, as of now each partition (year) has 33-35 million records. And, the data from year 1993 to till date is stored in `credit_cust` table. – Vivek Jan 22 '13 at 13:51
  • as you given us the DDL, the partition YEAR_2010 contains more than one year, 2007-2010. – Florin Ghita Jan 22 '13 at 14:05
  • Oops! That's a mistake i did while removing some partition's code, just to keep the code simple. Sorry for the confusion, i have changed it now. – Vivek Jan 22 '13 at 14:30
  • In this case you should try my second advice: remove the year from the partition by clause. – Florin Ghita Jan 22 '13 at 14:36
  • I have added the requirement above. I hope now its clear why `year` is needed in `partition by` – Vivek Jan 22 '13 at 16:46
  • your PARTITION YEAR_2010 contains only year 2010 so, it is useless to partition by year. However, read my entire response again with attention. And read jonearles's answer, because he may be right. – Florin Ghita Jan 23 '13 at 09:24
1

This query is using a large amount of CPU because of parallelism. The table is set to DEGREE 16. By default, your queries will use 16 or 32 parallel servers, depending on whether or not they are sorting.

The partitioning options and query execution plans are not directly relevant here. Even if they are incorrect, a "bad" query will not usually consume a significant amount of resources per second.

There are many possible solutions, depending at what level you want to disable parallelism. Here are some common solutions:

  1. Statement level: /*+ no_parallel(credit_cust) */
  2. Session level: alter session disable parallel query;
  3. Object level: alter table cust parallel 1;
  4. System level: alter system set parallel_max_servers = 0;

But think very carefully before you change anything. Some of those solutions can be disastrous if used incorrectly. You need to have a good understanding of parallelism, the system resources, and how the system is used. As usual, a good place to start is the manual.

Jon Heller
  • 34,999
  • 6
  • 74
  • 132