18

My intention is to obtain a paginated resultset of customers. I am using this algorithm, from Tom:

select * from (
  select /*+ FIRST_ROWS(20) */ FIRST_NAME, ROW_NUMBER() over (order by FIRST_NAME) RN
  from CUSTOMER C
)
where RN between 1 and 20
order by RN;

I also have an index defined on the column "CUSTOMER"."FIRST_NAME":

CREATE INDEX CUSTOMER_FIRST_NAME_TEST ON CUSTOMER (FIRST_NAME ASC);

The query returns the expected resultset, but from the explain plan I notice that the index is not used:

--------------------------------------------------------------------------------------
| Id  | Operation                 | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |          | 15467 |   679K|   157   (3)| 00:00:02 |
|   1 |  SORT ORDER BY            |          | 15467 |   679K|   157   (3)| 00:00:02 |
|*  2 |   VIEW                    |          | 15467 |   679K|   155   (2)| 00:00:02 |
|*  3 |    WINDOW SORT PUSHED RANK|          | 15467 |   151K|   155   (2)| 00:00:02 |
|   4 |     TABLE ACCESS FULL     | CUSTOMER | 15467 |   151K|   154   (1)| 00:00:02 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("RN">=1 AND "RN"<=20)
   3 - filter(ROW_NUMBER() OVER ( ORDER BY "FIRST_NAME")<=20)

I am using Oracle 11g. Since I just query for the first 20 rows, ordered by the indexed column, I would expect the index to be used.

Why is the Oracle optimizer ignoring the index? I assume it's something wrong with the pagination algorithm, but I can't figure out what.

Thanks.

Bogdan Minciu
  • 255
  • 1
  • 3
  • 6

2 Answers2

27

more than likely your FIRST_NAME column is nullable.

SQL> create table customer (first_name varchar2(20), last_name varchar2(20));

Table created.

SQL> insert into customer select dbms_random.string('U', 20), dbms_random.string('U', 20) from dual connect by level <= 100000;

100000 rows created.

SQL> create index c on customer(first_name);

Index created.

SQL> explain plan for select * from (
  2    select /*+ FIRST_ROWS(20) */ FIRST_NAME, ROW_NUMBER() over (order by FIRST_NAME) RN
  3    from CUSTOMER C
  4  )
  5  where RN between 1 and 20
  6  order by RN;

Explained.

SQL> @explain ""

Plan hash value: 1474094583

----------------------------------------------------------------------------------------------
| Id  | Operation                 | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |          |   117K|  2856K|       |  1592   (1)| 00:00:20 |
|   1 |  SORT ORDER BY            |          |   117K|  2856K|  4152K|  1592   (1)| 00:00:20 |
|*  2 |   VIEW                    |          |   117K|  2856K|       |   744   (2)| 00:00:09 |
|*  3 |    WINDOW SORT PUSHED RANK|          |   117K|  1371K|  2304K|   744   (2)| 00:00:09 |
|   4 |     TABLE ACCESS FULL     | CUSTOMER |   117K|  1371K|       |   205   (1)| 00:00:03 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("RN">=1 AND "RN"<=20)
   3 - filter(ROW_NUMBER() OVER ( ORDER BY "FIRST_NAME")<=20)

Note
-----
   - dynamic sampling used for this statement (level=2)

21 rows selected.

SQL> alter table customer modify first_name not null;

Table altered.

SQL> explain plan for select * from (
  2    select /*+ FIRST_ROWS(20) */ FIRST_NAME, ROW_NUMBER() over (order by FIRST_NAME) RN
  3    from CUSTOMER C
  4  )
  5  where RN between 1 and 20
  6  order by RN;

Explained.

SQL> @explain ""

Plan hash value: 1725028138

----------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |   117K|  2856K|       |   850   (1)| 00:00:11 |
|   1 |  SORT ORDER BY          |      |   117K|  2856K|  4152K|   850   (1)| 00:00:11 |
|*  2 |   VIEW                  |      |   117K|  2856K|       |     2   (0)| 00:00:01 |
|*  3 |    WINDOW NOSORT STOPKEY|      |   117K|  1371K|       |     2   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN     | C    |   117K|  1371K|       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("RN">=1 AND "RN"<=20)
   3 - filter(ROW_NUMBER() OVER ( ORDER BY "FIRST_NAME")<=20)

Note
-----
   - dynamic sampling used for this statement (level=2)

21 rows selected.

SQL>

add a NOT NULL in there to resolve it.

SQL> explain plan for select * from (
  2    select /*+ FIRST_ROWS(20) */ FIRST_NAME, ROW_NUMBER() over (order by FIRST_NAME) RN
  3    from CUSTOMER C
  4    where first_name is not null
  5  )
  6  where RN between 1 and 20
  7  order by RN;

Explained.

SQL> @explain ""

Plan hash value: 1725028138

----------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |   117K|  2856K|       |   850   (1)| 00:00:11 |
|   1 |  SORT ORDER BY          |      |   117K|  2856K|  4152K|   850   (1)| 00:00:11 |
|*  2 |   VIEW                  |      |   117K|  2856K|       |     2   (0)| 00:00:01 |
|*  3 |    WINDOW NOSORT STOPKEY|      |   117K|  1371K|       |     2   (0)| 00:00:01 |
|*  4 |     INDEX FULL SCAN     | C    |   117K|  1371K|       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("RN">=1 AND "RN"<=20)
   3 - filter(ROW_NUMBER() OVER ( ORDER BY "FIRST_NAME")<=20)
   4 - filter("FIRST_NAME" IS NOT NULL)

Note
-----
   - dynamic sampling used for this statement (level=2)

22 rows selected.

SQL>
DazzaL
  • 21,638
  • 3
  • 49
  • 57
  • 1
    +1 Would be nice to know why can't Oracle not use an index for a `nullable` column. – Andomar Nov 21 '12 at 16:48
  • 4
    it can. but only if at least ONE column in the index is NOT NULLABLe. you see, in oracle, NULL keys are not in the index. – DazzaL Nov 21 '12 at 16:51
  • 7
    eg: `SQL> create index c on customer(first_name, id);` if ID is a non nullable column you also get the desired result. if you have NO columns that are non nullable (wierd) then create a function index like `create index c on customer(first_name, 0);` the literal 0 is always not null, so forces each null row into the index. – DazzaL Nov 21 '12 at 16:54
0

You're querying for more columns than first_name. The index on first_name just contains the first_name column and a reference to the table. So to retrieve the other columns, Oracle has to perform a lookup to the table itself for each row. Most databases try to avoid this if they can't guarantee a low record count.

A database is typically not smart enough to know the effects of a where clause on a row_number column. However, your hint /*+ FIRST_ROWS(20) */ might have done the trick.

Perhaps the table is really small, so that Oracle expects the table scan to be cheaper than lookups, even for just 20 rows.

Andomar
  • 232,371
  • 49
  • 380
  • 404