Index partial values in Oracle

Question

I am creating a composite index.

create index I on TEST (A,B);

And my query is like

select * from TEST where A=:1 and B IS NOT NULL

Above query will only return few rows ( < 10) but my leading colulmn "A" is not much unique and can return half million records for a value.

Now if I run above query it does too much logical reads as this will scan all the block with values A=:1.

Is there any trick/workaround to index just partial data. For example only that data will go in index for which B IS NOT NULL. That will make my index very compact and quiker.

So for each value of A, you have lots of rows, almost all of which have B null? — Alex Poole, Jan 10 '18 at 18:06
You may consider [Bitmap Index](https://docs.oracle.com/cd/B28359_01/server.111/b28313/indexes.htm). — Wernfried Domscheit, Jan 10 '18 at 19:55

Alex Poole · Answer 1 · 2018-01-10T18:27:33.503

Is there any trick/workaround to index just partial data. For example only that data will go in index for which B IS NOT NULL.

You could do this with a function-based index. In their example for conditional uniqueness it says:

Oracle Database does not store in the index any rows where all the keys are NULL.

You aren't doing exactly the same thing but can use the same mechanism:

create index J on TEST (case when B is not null then A end, B);

Let's say your table is as simple as you showed, and has 100000 rows for the same A value, 99998 of which have B set to null and the other two have non-null values.

create table test (a number not null, b number);
insert into test values (1, 1);
insert into test values (1, 2);
insert into test select 1, null from dual connect by level < 99999;

and you create your original index and the function-based index:

create index I on TEST (A,B);
create index J on TEST (case when B is not null then A end, B);

Then you if you gather stats you can see the number of rows in each index:

select index_name, num_rows from user_indexes where index_name in ('I','J');

INDEX_NAME                       NUM_ROWS
------------------------------ ----------
I                                 1000000
J                                       2

Your query uses the smaller index, according to the execution plan:

var v1 number;
exec :v1 := 1;

set autotrace on
select * from test where a = :v1 and b is not null;

         A          B
---------- ----------
         1          1
         1          2

Explain Plan
-----------------------------------------------------------

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------
Plan hash value: 3591688522

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     2 |    10 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| TEST |     2 |    10 |     2   (0)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | J    |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------

   1 - filter("A"=:V1)
   2 - filter("B" IS NOT NULL)

Statistics
-----------------------------------------------------------
               4  Requests to/from client
               4  consistent gets
...

If I drop the J index and repeat the query it does a full table scan and 1610 consistent gets; you might see an I index range scan or fast full scan depending on selectivity - as I only have one A value this doesn't quite match your scenario.

`INDEX FULL SCAN` will IMO not work well if there are lot of distinct values of `A`- each of them with only small numer of not null `B`s... — Marmite Bomber, Jan 10 '18 at 18:34
Thats really helpful - Though I need to test this in my Devo environment but I am just thinking "it does not need to rewrite the query and oracle is smart enough to use Index J" I hope when there are too many distinct values for column A it willl use :Index range scan". Also when you drop index "J", ideally index "I" will be used definitely based on data. — Ajay Singh, Jan 10 '18 at 18:45
Oracle should use the best index available, but it's going to be influenced by the actual data distribution, so you'll need to look at what it actually does (including today). Look at the execution plans for various scenarios and see if they are as you expect/hope. (It's using an index full scan here because there is only one A value; with more realistic data it might well do a range scan, or fast full scan, or even ignore `J` and go back to `I`, or...) — Alex Poole, Jan 10 '18 at 18:53
Looks like Oracle is not intelligent here as it's choosing Index full scan all the time after creating new index. With that plan it's taking more time now. — Ajay Singh, Jan 11 '18 at 06:57

Marmite Bomber · Answer 2 · 2018-01-10T18:15:52.947

If you can use it the bitmap index will help you, as the bitmap index stores the NULL values as well.

But note, that you don't want to use an bitmap index in case there is a transactional load on the table, particularly concurrent transactions.

My Test case

create table tt as
select 1 a, 1 b from dual union all
select 1 a, null b from dual connect by level <= 1000000;

create index tt_idx on tt(a,b);

select /*+ INDEX(tt) */ * from tt where a = 1 and b is not NULL;

Performs

   1105  consistent gets

create bitmap index tt_idx on tt(a,b);

Performs

     83  consistent gets

Alternatively you may use function based index, but you must rewrite your query.

create   index tt_idx on tt(a,case when b is not null then 1 end);

The query must be re-formulated so that the index can be used

select /*+ INDEX(tt) */ * from tt where a = 1 and case when b is not null then 1 end = 1;

Performs

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"=1 AND CASE  WHEN "B" IS NOT NULL THEN 1 END =1)

with

     5  consistent gets

This is OLTP database with high number of transaction, Bitmap index is not an option. — Ajay Singh, Jan 11 '18 at 06:58
@Ajay you may profit of FBI if you are able to rewrite the WHERE condition. See the update above. — Marmite Bomber, Jan 12 '18 at 20:01

Index partial values in Oracle

2 Answers2