Selecting both MIN and MAX From the Table is slower than expected

Question

I have a table MYTABLE with a date column SDATE which is the primary key of the table and has a unique index on it.

When I run this query:

SELECT MIN(SDATE) FROM MYTABLE

it gives answer instantly. The same happens for:

SELECT MAX(SDATE) FROM MYTABLE

But, if I query both together:

SELECT MIN(SDATE), MAX(SDATE) FROM MYTABLE

it takes much more time to execute. I analyzed the plans and found when one of min or max is queried, it uses INDEX FULL SCAN(MIN/MAX) but when both are queried at the same time, it does a FULL TABLE SCAN.

why?

Test Data:

version 11g

create table MYTABLE
(
  SDATE  DATE not null,
  CELL   VARCHAR2(10),
  data NUMBER
)
tablespace CHIPS
  pctfree 10
  pctused 40
  initrans 1
  maxtrans 255
  storage
  (
    initial 64K
    minextents 1
    maxextents unlimited
  );

alter table MYTABLE
  add constraint PK_SDATE primary key (SDATE)
  using index 
  tablespace SYSTEM
  pctfree 10
  initrans 2
  maxtrans 255
  storage
  (
    initial 64K
    minextents 1
    maxextents unlimited
  );

Load table:

declare 
  i integer;
begin
  for i in 0 .. 100000 loop
     insert into MYTABLE(sdate, cell, data)
     values(sysdate - i/24, 'T' || i, i);     
     commit;
  end loop;
end;

Gather stats:

begin
  dbms_stats.gather_table_stats(tabname => 'MYTABLE', ownname => 'SYS');
end;

Plan1:

enter image description here

Plan2:

enter image description here

My table has almost 100000 rows and stats are fresh; you can easily re-generate the problem by creating a simple table with only one or two columns and see the results for yourself. — RGO, Sep 24 '12 at 13:24
what are the costs for the queries? can you post the plans? I think the index is very fragmented. — Florin Ghita, Sep 24 '12 at 13:34
This is my small test environment so I don't care SYSTEM tablespace. but anyway the same is happening on the production table. — RGO, Sep 24 '12 at 13:40
Please don't take bad habits: (1) Don't create objects in the SYS schema, ever. (2) don't commit in a loop. — Vincent Malgrat, Sep 24 '12 at 13:54

score 12 · Accepted Answer · answered Sep 24 '12 at 13:33

12

The Index Full Scan can only visit one side of the index. When you are doing

SELECT MIN(SDATE), MAX(SDATE) FROM MYTABLE

you are requesting to visit 2 sides. Therefore, if you want both the minimum and the maximum column value, an Index Full Scan is not viable.

A more detailed analyze you can find here.

answered Sep 24 '12 at 13:33

avi

912
8
22

The link you have, provides no explanation for this behaviour. It doesn't answer why the index cannot be (or is not) used to find both the MIN and the MAX. – ypercubeᵀᴹ Sep 24 '12 at 13:46
3

+1 for providing the right answer and a link to Richard Foote's excellent blog post around this subject. – Rob van Wijk Sep 24 '12 at 14:27
@ypercube... About what do you think is all that talk in that link i gave if it does not explain why the Index Full Scan does not work for that query? and tnx Rob van Wijk – avi Sep 24 '12 at 16:14
I doubt this answer is fully correct. This query: `SELECT MIN(SDATE), MIN(SDATE) FROM MYTABLE` also runs differently! Please see my updated post. – RGO Sep 24 '12 at 17:24
@RezaGoodarzi, it's possible that the unusual (and redundant) query `SELECT MIN(SDATE), MIN(SDATE) FROM MYTABLE` is somehow causing Oracle to not detect the optimisation that it does for `SELECT MIN(SDATE) FROM MYTABLE`. However, in my test (11gR2) the optimisation works fine even with redundant clauses in the select. – Jeffrey Kemp Sep 25 '12 at 02:53
You'd think it could optimize it away into 2 index full scans given there's no other columns in the query and not grouping by anything. – Davos Feb 01 '18 at 22:58

Vincent Malgrat · Answer 2 · 2012-09-24T14:22:04.627

The explain plans are different: a single MIN or MAX will produce a INDEX FULL SCAN (MIN/MAX) whereas when the two are present you will get an INDEX FULL SCAN or a FAST FULL INDEX SCAN.

To understand the difference, we have to look for a description of a FULL INDEX SCAN:

In a full index scan, the database reads the entire index in order.

In other words, if the index is on a VARCHAR2 field, Oracle will fetch the first block of the index that would contain for example all entries that start with the letter "A" and will read block by block all entries alphabetically until the last entry ("A" to "Z"). Oracle can process in this way because the entries are sorted in a binary tree index.

When you see INDEX FULL SCAN (MIN/MAX) in an explain plan, that is the result of an optimization that uses the fact that since the entries are sorted, you can stop after having read the first one if you are only interested by the MIN. If you are interested in the MAX only, Oracle can use the same access path but this time starting by the last entry and reading backwards from "Z" to "A".

As of now, a FULL INDEX SCAN has only one direction (either forward or backward) and can not start from both ends simultaneously, this is why when you ask for both the min and the max, you get a less efficient access method.

As suggested by other answers, if the query needs critical efficiency, you could run your own optimization by searching for the min and the max in two distinct queries.

My first thought was why a full index scan? Why not a seek? But since it stops after the first value read then that makes sense, it's just one step, vs an index seek which would be multiple steps to walk the b-tree. Thanks for the explain of the explain. — Davos, Feb 01 '18 at 22:49

score 5 · Answer 3 · answered Sep 25 '12 at 14:49

5

Try not selecting both edges of the index in one query , Accessing the query in a different way like this :

select max_date, min_date
from (select max(sdate) max_date from mytable),
       (select min(sdate) min_date from mytable)

will cause the optimizer to access the index in INDEX_FULL_SCAN(MIN/MAX) in nested loops (in our case , twice).

enter image description here

answered Sep 25 '12 at 14:49

planben

680
6
20

I was gonna propose the same solution but your answer does the magic, surprising how the sql engine isn't smart enough to figure this out automatically, lol – benjaminz Jun 16 '17 at 14:30

score 2 · Answer 4 · edited Jun 20 '20 at 09:12

I have to say that I do not see the same behaviour in 11.2

If I set up a test case as follows and updated from 10k to 1m rows in response to Vincent's comment

set linesize 130
set pagesize 0
create table mytable ( sdate date );

Table created.

insert into mytable
 select sysdate - level
   from dual
connect by level <= 1000000;
commit;

1000000 rows created.


Commit complete.

alter table mytable add constraint pk_mytable primary key ( sdate ) using index;

Table altered.

begin
dbms_stats.gather_table_stats( user, 'MYTABLE' 
                             , estimate_percent => 100
                             , cascade => true
                               );
end;
/

PL/SQL procedure successfully completed.

Then, executing your queries I get almost identical looking explain plans (notice the different types of INDEX FULL SCAN)

explain plan for select min(sdate) from mytable;

Explained.

select * from table(dbms_xplan.display);
Plan hash value: 3877058912

-----------------------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |        |     1 |     8 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE        |        |     1 |     8 |        |      |
|   2 |   INDEX FULL SCAN (MIN/MAX)| PK_MYTABLE |     1 |     8 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

9 rows selected.

explain plan for select min(sdate), max(sdate) from mytable;

Explained.

select * from table(dbms_xplan.display);
Plan hash value: 3812733167

-------------------------------------------------------------------------------
| Id  | Operation    | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |     1 |     8 |   252   (0)| 00:00:04 |
|   1 |  SORT AGGREGATE  |        |     1 |     8 |        |          |
|   2 |   INDEX FULL SCAN| PK_MYTABLE |  1000K|  7812K|   252   (0)| 00:00:04 |
-------------------------------------------------------------------------------

9 rows selected.

To quote from a previous answer of mine:

The two most common reasons for a query not using indexes are:

It's quicker to do a full table scan.

Poor statistics.

Unless there's something you're not posting in the question my immediate answer would be that you have not collected statistics on this table, you haven't collected them with a high enough estimate percent or you've used analyze, which will not help the Cost Based Optimizer, unlike dbms_stats.gather_table_stats.

To quote from the documentation on analyze:

For the collection of most statistics, use the DBMS_STATS package, which lets you collect statistics in parallel, collect global statistics for partitioned objects, and fine tune your statistics collection in other ways. See Oracle Database PL/SQL Packages and Types Reference for more information on the DBMS_STATS package.

Use the ANALYZE statement (rather than DBMS_STATS) for statistics collection not related to the cost-based optimizer:

I've just seen you comment stating 100k rows but re-doing this with that amount makes no difference. — Ben, Sep 24 '12 at 13:29
I provided data in my post. I also did it myself again and got the same results with 11.1. — RGO, Sep 24 '12 at 13:43
10k is puny :) try with 1M rows, you should see a difference — Vincent Malgrat, Sep 24 '12 at 13:47
Also the difference in explain plan is a very big one: the `INDEX FULL SCAN (MIN/MAX)` will stop after the first row fetched (column rows=1) whereas the `INDEX FULL SCAN` will read all index blocks (column rows=10000) that's 10000 times more work ! (nearly :) — Vincent Malgrat, Sep 24 '12 at 14:02
@VincentMalgrat, it looks like I'm to be doubted my entire life :-). I've updated it to 1m rows. — Ben, Sep 24 '12 at 14:03
I was trying to draw attention to what was actually written rather than the output; I've updated it to "almost identical _looking_"... @VincentMalgrat. — Ben, Sep 24 '12 at 14:09
@VincentMalgrat `INDEX FULL SCAN (MIN/MAX)` is not a really **full** index scan? — Florin Ghita, Sep 24 '12 at 14:12
@FlorinGhita it's an optimized version of the `INDEX FULL SCAN` that will stop the full scan after the first entry that is found to answer the WHERE condition, in this case the first row. I can see how the name can be misleading :) — Vincent Malgrat, Sep 24 '12 at 14:24
@Ben they look *almost* the same :) and in fact I agree that the timing difference looks smallish (0.01 to 0.04) but the amount of work must vary a lot I suspect (can you run both statments with AUTOTRACE TRACEONLY?) — Vincent Malgrat, Sep 24 '12 at 14:34

Selecting both MIN and MAX From the Table is slower than expected

4 Answers4

Linked