
The following query takes around 45 seconds in Oracle 11g:

select count(cap.ISHIGH),ms.SID,ms.NUM from CDetail cap,MData ms  
where cap.MDataID_FK=ms.MDataID_PK and trunc(cap.CREATEDTIME) between trunc(sysdate-10) and trunc(sysdate)
group by ms.SID,ms.NUM ;

Explain plan:

    -------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                        | Name                   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                 |                        |   766K|    32M|       | 94421   (1)| 00:18:54 |
    |   1 |  HASH GROUP BY                   |                        |   766K|    32M|    41M| 94421   (1)| 00:18:54 |
    |*  2 |   HASH JOIN                      |                        |   766K|    32M|    21M| 85716   (1)| 00:17:09 |
    |   3 |    VIEW                          | VW_GBC_5               |   766K|    13M|       | 73348   (1)| 00:14:41 |
    |   4 |     HASH GROUP BY                |                        |   766K|    13M|    98M| 73348   (1)| 00:14:41 |
    |*  5 |      FILTER                      |                        |       |       |       |            |          |
    |   6 |       TABLE ACCESS BY INDEX ROWID| CDetail         |  3217K|    58M|       | 63738   (1)| 00:12:45 |
    |*  7 |        INDEX RANGE SCAN          | IDX_CPCTYDTLTRNCCRTDTM |  3365K|       |       | 14679   (1)| 00:02:57 |
    |   8 |    TABLE ACCESS FULL             | MData       |   871K|    22M|       |  9665   (1)| 00:01:56 |
    -------------------------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       2 - access("ITEM_1"="MS"."MDataID_PK")
       5 - filter(TRUNC(SYSDATE@!-10)<=TRUNC(SYSDATE@!))
       7 - access(TRUNC(INTERNAL_FUNCTION("CREATEDTIME"))>=TRUNC(SYSDATE@!-10) AND 
                  TRUNC(INTERNAL_FUNCTION("CREATEDTIME"))<=TRUNC(SYSDATE@!))

table MData contains around 900,000 rows and table CDetail contains 23,000,000 rows.

Should I introduce a new index, or is there another way to optimize the above query?
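For context, the TRUNC() predicate is equivalent to a half-open range on the raw column, which would let a plain index on CREATEDTIME (or partition pruning) work without applying TRUNC() to every row. A sketch, assuming CREATEDTIME is a DATE:

    -- Equivalent rewrite without TRUNC() on the column (assumes CREATEDTIME is a DATE):
    select count(cap.ISHIGH), ms.SID, ms.NUM
    from   CDetail cap
    join   MData   ms on cap.MDataID_FK = ms.MDataID_PK
    where  cap.CREATEDTIME >= trunc(sysdate) - 10
    and    cap.CREATEDTIME <  trunc(sysdate) + 1
    group by ms.SID, ms.NUM;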

Edit 1: IDX_CPCTYDTLTRNCCRTDTM is a function-based index on trunc(CREATEDTIME).

Edit 2: explain plan for the full table scan, using the hint `/*+ full(Cdetail) */`:

---------------------------------------------------------------------------------------------------
| Id  | Operation              | Name             | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                  |   780K|    33M|       |   160K  (2)| 00:32:01 |
|   1 |  HASH GROUP BY         |                  |   780K|    33M|    42M|   160K  (2)| 00:32:01 |
|*  2 |   HASH JOIN            |                  |   780K|    33M|    22M|   151K  (2)| 00:30:15 |
|   3 |    VIEW                | VW_GBC_5         |   780K|    13M|       |   138K  (2)| 00:27:46 |
|   4 |     HASH GROUP BY      |                  |   780K|    14M|   230M|   138K  (2)| 00:27:46 |
|*  5 |      FILTER            |                  |       |       |       |            |          |
|*  6 |       TABLE ACCESS FULL| CDetail   |  7521K|   136M|       |   120K  (2)| 00:24:02 |
|   7 |    TABLE ACCESS FULL   | MData |   890K|    22M|       |  9666   (1)| 00:01:56 |
---------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ITEM_1"="MS"."MDataID_PK")
   5 - filter(TRUNC(SYSDATE@!-10)<=TRUNC(SYSDATE@!))
   6 - filter(TRUNC(INTERNAL_FUNCTION("CREATEDTIME"))>=TRUNC(SYSDATE@!-10) AND 
              TRUNC(INTERNAL_FUNCTION("CREATEDTIME"))<=TRUNC(SYSDATE@!))
  • You should add indices on the join columns between `CDetail` and `MData`, namely the `MDataID_FK` and `MDataID_PK` columns. This should speed up the join. – Tim Biegeleisen Jan 30 '17 at 08:59
  • @TimBiegeleisen. Why do you think that index would help? – BobC Jan 30 '17 at 09:18
  • @BobC Well the join has to be performed, with or without subsequent aggregation, and wouldn't an index make that happen faster? – Tim Biegeleisen Jan 30 '17 at 09:22
  • @TimBiegeleisen. Look at the execution plan (now I know it's based on estimates, but we have no other diags to go on at this time); where is the time being spent? It's mostly the TABLE ACCESS BY ROWID in operation 6. This is not surprising - we are doing about 3million single block IO's. Perhaps a full scan would be more efficient? (That's a rhetorical question). I would like to see the execution with a scan of the CDETAIL table, and then see how that performs. – BobC Jan 30 '17 at 09:26
  • @BobC What does `21M` mean as compared to `63738`? Agreed, that a full table scan won't benefit from an index. – Tim Biegeleisen Jan 30 '17 at 09:34
  • @TimBiegeleisen. My point was that you were suggesting creating an index to speed up the join; but that is not the part of the query that is slow. The 21M refers to the temp space (in bytes) for the HASH JOIN. – BobC Jan 30 '17 at 09:38
  • @BobC Operation #6 appears to be the _fastest_ operation in that explain table at `12:45` while almost everything else is longer. Am I missing something here? – Tim Biegeleisen Jan 30 '17 at 09:41
  • @TimBiegeleisen. The times are cumulative. So for example, the time for the HASH JOIN at line 2 is 18:54 - 17:09. In other words, 105 seconds. Whereas the table access at line 6 is nearly 10 mins, or 588 seconds. – BobC Jan 30 '17 at 09:49

1 Answer

  1. Thank you for sharing an explain plan; that's a good start. The thing with an explain plan, however, is that it gives you estimates, not actuals. If you can, get a SQL Monitor report: it will show you the actual cardinalities and where time is being spent in the query.

  2. The date filter is expecting about 3M rows (Ids 6 and 7). Is that accurate?

  3. What is the definition of the IDX_CPCTYDTLTRNCCRTDTM index? Does it happen to be function-based?

  4. Just to validate my thinking, can you add the following hint, run the query, and get the explain plan again:

    select /*+ full( cap ) */ ...
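For point 1, a SQL Monitor report can be pulled with `DBMS_SQLTUNE.REPORT_SQL_MONITOR` (a sketch only: it requires the Tuning Pack license, and the `&sql_id` placeholder must be replaced with the real SQL_ID from `V$SQL`):

    -- Sketch: fetch the SQL Monitor report for the statement.
    -- Requires the Tuning Pack; replace &sql_id with the SQL_ID from V$SQL.
    select dbms_sqltune.report_sql_monitor(
             sql_id       => '&sql_id',
             type         => 'TEXT',
             report_level => 'ALL') as report
    from   dual;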

BobC
  • 2. Yes, the 3M rows are accurate. For 3 and 4, I have edited my question. – Varsha Gadekar Jan 30 '17 at 10:43
  • @VarshaGadekar. How long did the query take when using the full scan? – BobC Jan 30 '17 at 13:19
  • The query took 30 seconds using the full table scan. – Varsha Gadekar Jan 30 '17 at 13:42
  • @VarshaGadekar. Ok, another set of questions. 1. How many rows does the final query return? Are you spooling these through SQL*Plus? 2. Do you have the resources (CPU and IO bandwidth) to use parallelism? 3. You may want to consider partitioning the CDETAIL table, but you would have to rewrite the query so as not to use TRUNC() on the date column, which is not too difficult. – BobC Jan 30 '17 at 13:51
  • 1. The query returns 779,937 rows, and I am using SQL Developer, not SQL*Plus. 2. I can't use parallelism. – Varsha Gadekar Jan 31 '17 at 06:25
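The partitioning idea raised in the comments could look something like the sketch below. The partition names, cutoff date, and one-day interval are illustrative only (not from the thread); the key point is that once CDetail is range-partitioned on CREATEDTIME, a TRUNC-free predicate lets the optimizer prune to roughly ten daily partitions:

    -- Sketch: interval-partitioned copy of CDetail (11g supports CTAS
    -- with interval partitioning). Names and dates are illustrative.
    create table CDetail_part
    partition by range (CREATEDTIME)
    interval (numtodsinterval(1, 'DAY'))
    ( partition p0 values less than (date '2017-01-01') )
    as select * from CDetail;

    -- The query must then filter on the raw column for pruning to apply:
    --   where cap.CREATEDTIME >= trunc(sysdate) - 10
    --     and cap.CREATEDTIME <  trunc(sysdate) + 1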