0

Consider that a table has 2 columns,

  1. column_1 has 60% distinct values and column_2 has 40% distinct values.
  2. Both has same data type, say integer

If one has to apply the indexes on this table, which column he should chose to index? why?

Santosh Kangane
  • 377
  • 1
  • 3
  • 12
  • 1
    Theory or practice? Tag dbms used. – jarlh Jan 22 '16 at 13:35
  • if you use both this collumns oftenly to join etc, then both :) – Robert Jan 22 '16 at 13:35
  • 2
    I think it has more to do with how often you reference the columns in a join or a where clause and less to do with the cardinality of the unique values. The cardinality may help decide which type of index (ie bitmap vs btree) to use but not whether or not to use one. – Hambone Jan 22 '16 at 13:39
  • Why do you have to apply an index? Is a query slow? Then look at that query? If someone just said -- put an index on this table -- ask them why. – Hogan Jan 22 '16 at 15:42

1 Answers1

2

The most important consideration to create an index is if you perform search over that field so the Datatype is irrelevant.

Now while more distinct values more efficient is the the index because you discard rows faster. If most of the rows are the same the index will be the same as a FULL SCAN

So if you have querys like

 SELECT * FROM Table WHERE fieldA  = Value;

 SELECT * FROM Table WHERE fieldB  = Value;

You want two index, one for each field

But if you have querys like

 SELECT * FROM Table WHERE fieldA  = Value
                       AND fieldB  = Value;

You want composte index with both fields (fieldA, fieldB)

Last tip is if you want return a single value on your select

 SELECT fieldB FROM Table WHERE fieldA  = Value;

You also want a composite index, because you will search the index for Value and dont have to make a lookup on the table to get the value of fieldB because is already on the index.

Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118
  • you got it right "more distinct values more efficient is the the index because you discard rows faster. If most of the rows are the same the index will be the same as a FULL SCAN" – Santosh Kangane Jan 23 '16 at 05:57