Hint FIRST_ROWS(n) not giving optimized result for Order by clause

Question

We have around 8 million records in a table having around 50 columns, we need to see few records very quickly so we are using FIRST_ROWS(10) hint for this purpose and its working amazingly fast.

SELECT /*+ FIRST_ROWS(10) */ ABC.view_ABC.ID, ABC.view_ABC.VERSION, ABC.view_ABC.M_UUID, ABC.view_ABC.M_PROCESS_NAME FROM ABC.view_ABC

However when we put a clause of ORDER BY e.g. creationtime (which is almost a unique value for each row in that table), this query will take ages to return all columns.

SELECT /*+ FIRST_ROWS(10) */ ABC.view_ABC.ID, ABC.view_ABC.VERSION, ABC.view_ABC.M_UUID, ABC.view_ABC.M_PROCESS_NAME FROM ABC.view_ABC ORDER BY ABC.view_ABC.CREATIONTIME DESC

One thing that I noticed is; if we put a ORDER BY for some column like VERSION which has same value for multiple rows, it gives the result better.

This ORDER BY is not working efficiently for any unique column like for ID column in this table.

Another thing worth considering is; if we reduce the number of columns to be fetched e.g. 3 columns instead of 50 columns the results are somehow coming faster.

P.S. gather statistics are run on this table weekly, but data is pushed hourly. Only INSERT statement is running on this table, no DELETE or UPDATE queries are running on this table.

Also, there is a simple view created no this table, the above queries are being run on same view.

What indexes do you have on the table? What explain plan are you currently getting? I would expect that an index on `CREATIONTIME` would help _this_ query a fair bit. (P.S. your query and question (`CREATIONTIME`/`VERSION`') are inconsistent. — Ben, Aug 01 '16 at 07:55
There is a multi-column index which include CREATIONTIME field.How my query and question are inconsistent, please explain? — emarshah, Aug 03 '16 at 12:59

score 0 · Answer 1 · answered Aug 01 '16 at 09:12

Without an order by clause the optimiser can perform whatever join operations your view is hiding and start returning data as soon as it has some. The hint is changing how it accesses the underlying tables so that it, for example, does a nested loop join instead of a merge join - which would allow it to find the first matching rows quickly; but might be less efficient overall for returning all of the data. Your hint is telling the optimiser that you want it prioritise the speed of the first batch of rows returned over the speed of the entire query.

When you add the order by clause then all of the data has to be found before it can be ordered. All of the join conditions have to be met and all of the nested loops/merges etc. completed, and then the entire result set has to be sorted into the order you specified, before any rows can be returned.

If the column you're ordering by is indexed and that index is being used (or can be used) by the optimiser to identify rows in the driving table then it's possible it could be incorporating that into the sort, but you can't rely on that as the optimiser can change the plan as the data and statistics change.

You may find it useful to look at the execution plans of your various queries, with and without the hint, to see what the optimiser is doing in each case, including where in the chain of steps it is doing the sort operation, and the types of joins it is doing.

score 0 · Answer 2 · answered Aug 04 '16 at 13:18

There is a multi-column index on this column (CREATION_TIME), somehow oracle hint optimizer was not using this index.

However on same table there was another column (TERMINATION_TIME), it had an index on itself. So we use the same query but with this indexed column in ORDER BY clause.

Below is the explain plan for first query with CREATION_TIME in ORDER BY clause which is part of multi-column index.

-------------------------------------------------------------------------------------------------------------
| Id  | Operation          | Name                           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                                |  7406K|   473M|       |   308K  (1)| 01:01:40 |
|   1 |  SORT ORDER BY     |                                |  7406K|   473M|   567M|   308K  (1)| 01:01:40 |
|   2 |   TABLE ACCESS FULL| Table_ABC                      |  7406K|   473M|       |   189K  (1)| 00:37:57 |
-------------------------------------------------------------------------------------------------------------

And this one is with TERMINATION_TIME as ORDER BY clause.

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                                |    10 |   670 |    10   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TABLE_ABC                      |  7406K|   473M|    10   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN DESCENDING| XGN620150305000000             |    10 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

If you see, its a clear difference in the Cost, Rows involved, Usage of Temporary Space (which is not even used in later case) and finally the Time.

Now the query response time is much better.

Thanks.

Hint FIRST_ROWS(n) not giving optimized result for Order by clause

2 Answers2