
I'm trying to retrieve the most recent 20 rows (by LAST_EVENT_DATE) per user from a table with more than 1 million records, where some users have more than 100K rows. The query returns in milliseconds when the user has a small number of records, but it takes more than 2 minutes to fetch 20 rows for users with roughly 10K-100K records.

Here's the query:

select *
  from ( select *
           from TABLE1
          where USER_ID = 41063660
            and COL1 = 0
            and COL2 = 0
          order by LAST_EVENT_DATE desc )
 where rownum <= 20;

There is an index (I_LASTEVENTDT) on (USER_ID, COL1, COL2, LAST_EVENT_DATE DESC)
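For reference, an index matching that description could have been created like this (a sketch reconstructed from the question; the actual DDL isn't shown):

```sql
-- Hypothetical DDL matching the index described above
CREATE INDEX i_lasteventdt
    ON table1 (user_id, col1, col2, last_event_date DESC);
```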

Here is the explain plan:

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                               |    20 | 38960 |       | 66959   (1)| 00:13:24 |
|*  1 |  COUNT STOPKEY                 |                               |       |       |       |            |          |
|   2 |   VIEW                         |                               | 65500 |   121M|       | 66959   (1)| 00:13:24 |
|*  3 |    SORT ORDER BY STOPKEY       |                               | 65500 |    96M|   102M| 66959   (1)| 00:13:24 |
|   4 |     TABLE ACCESS BY INDEX ROWID| TABLE1                        | 65500 |    96M|       | 47280   (1)| 00:09:28 |
|*  5 |      INDEX RANGE SCAN          | I_LASTEVENTDT                 | 65500 |       |       |   309   (0)| 00:00:04 |
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=20)
   3 - filter(ROWNUM<=20)
   5 - access("USER_ID"=41063660 AND "COL1"=0 AND
              "COL2"=0)

I've tried to follow the example given on http://use-the-index-luke.com/sql/sorting-grouping/indexed-order-by

I have also tried creating two separate indexes, on (USER_ID, COL1, COL2) and on (LAST_EVENT_DATE DESC), as well as a single index on (USER_ID, LAST_EVENT_DATE DESC). Performance was worse in both cases, even though the latter index eliminated the sort step.

How do I get better performance from this query?

Thanks in advance.

  • The explain plan says the query took only 15 seconds for 65K rows (I hope I'm not misreading). How come you say it takes 2 minutes? Do you do more processing before and after the query? – sstan Jun 21 '16 at 22:43
  • The plan shows 13 mins, not 15 seconds, and is only an estimate. It takes 2+ minutes, depending on the number of rows fetched, when I run it from my application. – user2755442 Jun 21 '16 at 23:02
  • Oops. Thanks for clarifying that. Just for investigation purposes, did you try adding the `FIRST_ROWS` hint to your query? Does it change the explain plan at all? – sstan Jun 21 '16 at 23:05
  • Didn't try it, but as I understand it, the CBO is free to ignore hints even if I put them in the query. – user2755442 Jun 21 '16 at 23:58
  • Looks like there is a big write to temp space for sorting. If you have a result set with 100K records and only need 20 of them, 99.98% of the data is thrown away. Can you please show your table and index definitions in their entirety? See if this [AskTom](http://www.oracle.com/technetwork/issue-archive/2006/06-sep/o56asktom-086197.html) article is of any help. – OldProgrammer Jun 22 '16 at 00:03
  • Also, run a trace and look at tkprof output. – OldProgrammer Jun 22 '16 at 00:14
  • The way the subquery is written, it doesn't give Oracle a chance to restrict the row set before it builds the data. So Oracle should choose not to use the index in the first place. Because it has to join to the main table to get all the columns in your `SELECT *` in the subquery. Only after that busy work is done can it reduce the row set to 20 in the outer query. The query needs a rewrite so that Oracle can use an index to get the appropriate 20 rows and then join to the table to get the rest of the columns for the `SELECT *` – Peter M. Jun 22 '16 at 00:37
  • Maybe try it using the last example on this page as a guide? https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions137.htm – Peter M. Jun 22 '16 at 00:49
  • ^^ You may find that Oracle is smart enough to make that style of query fast. If that's not the case -- keep the select list of the inner query to just the date of the 20th row in the series (don't `select *`). Then use the outer query to `select *` from the table using the rest of your key and date >= result from inner query. – Peter M. Jun 22 '16 at 01:01
  • @OldProgrammer, I went through the AskTom article you referenced while trying out optimizations. As you can see, I am using the same query Tom suggests as the optimal top-N query, and it generates a similar plan with the SORT ORDER BY STOPKEY. But for some reason, the sort in my plan happens on 65K rows instead of 20. I'm trying to figure out why the COUNT STOPKEY shows 65K rows instead of 20. I'll try running a trace and tkprof as you suggested. – user2755442 Jun 22 '16 at 01:19
  • Thanks @PeterM. The problem is I don't know beforehand what that date range would be. – user2755442 Jun 22 '16 at 01:21
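Following up on the hint discussion in the comments above, a `FIRST_ROWS(n)` hinted variant of the original query might look like this (a sketch; whether it actually changes the plan depends on the optimizer and statistics, and on 12c+ the row-limiting clause is an alternative):

```sql
-- Hinted variant of the original query (the optimizer may still ignore the hint)
select *
  from ( select /*+ FIRST_ROWS(20) */ *
           from TABLE1
          where USER_ID = 41063660
            and COL1 = 0
            and COL2 = 0
          order by LAST_EVENT_DATE desc )
 where rownum <= 20;

-- On Oracle 12c and later, the same intent can be expressed directly:
select *
  from TABLE1
 where USER_ID = 41063660
   and COL1 = 0
   and COL2 = 0
 order by LAST_EVENT_DATE desc
 fetch first 20 rows only;
```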

2 Answers


First try something like:

SELECT *
FROM
   (SELECT t.*, ROW_NUMBER() OVER (ORDER BY last_event_date DESC) r
    FROM table1 t
    WHERE user_id = 41063660
    AND col1 = 0
    AND col2 = 0)
WHERE r <= 20;

If that proves not to be fast, try:

SELECT table1.*
FROM table1,
    ( SELECT last_event_date, ROW_NUMBER() OVER (ORDER BY last_event_date desc) R
    FROM table1
    WHERE user_id = 41063660
    AND col1 = 0
    AND col2 = 0 ) sub
WHERE table1.user_id = 41063660
AND table1.col1 = 0
AND table1.col2 = 0
AND sub.R = 20
AND table1.last_event_date >= sub.last_event_date
AND ROWNUM <= 20;

There may be a more direct way to write it - I don't have an Oracle instance at my disposal to try it out.

An alternative is to implement a SQL function just to pick up the date of the 20th (or Nth) row given certain keys. Then call that SQL function in a style similar to my second example but without the subquery.
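A sketch of that function-based approach, with hypothetical names and untested against a live instance, as noted above:

```sql
-- Hypothetical helper: returns the date of the Nth-most-recent event
-- for the given keys (MIN of the top N dates = the Nth date).
CREATE OR REPLACE FUNCTION nth_event_date (
    p_user_id IN NUMBER,
    p_col1    IN NUMBER,
    p_col2    IN NUMBER,
    p_n       IN PLS_INTEGER
) RETURN DATE
IS
    v_date DATE;
BEGIN
    SELECT MIN(last_event_date)
      INTO v_date
      FROM ( SELECT last_event_date
               FROM table1
              WHERE user_id = p_user_id
                AND col1 = p_col1
                AND col2 = p_col2
              ORDER BY last_event_date DESC )
     WHERE ROWNUM <= p_n;
    RETURN v_date;  -- NULL when the user has no matching rows
END nth_event_date;
/

-- The main query can then use the function instead of a subquery join:
SELECT *
  FROM table1
 WHERE user_id = 41063660
   AND col1 = 0
   AND col2 = 0
   AND last_event_date >= nth_event_date(41063660, 0, 0, 20);
```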

Peter M.

I think you should aim for an index fast full scan and avoid sorting whole rows. The idea here is to materialize just a small set of records (the 20 most recent dates) and join them back to the main table.

with Dates as (
    select /*+ Materialize */ LAST_EVENT_DATE 
       from ( select LAST_EVENT_DATE 
                      from TABLE1                                                                                                              
                     where USER_ID= 41063660  
                        and COL1=0 
                        and COL2 = 0
                     order by LAST_EVENT_DATE desc) 
       where rownum <= 20 )
select t2.* 
  from (
    select t1.*
      from TABLE1 t1 join Dates on t1.LAST_EVENT_DATE >= Dates.LAST_EVENT_DATE 
                               and t1.USER_ID= 41063660  
                               and t1.COL1=0 
                               and t1.COL2 = 0
    order by t1.LAST_EVENT_DATE desc) t2
where rownum <= 20;
vercelli
  • 4,717
  • 2
  • 13
  • 15