4

According to this:

Selectivity is the value between 0 and 1, and it is the fraction of rows returned after applying a filter on the table. For example if a table has 10,000 rows and the query returns 2601 rows, the selectivity would be 2601/10000 or .26 or 26 percent. Selectivity enables you (or optimizer for that matter) to decide which data access method is optimum in the execution plan.

I need some clarifications: ok, that table has 10000 rows and only 2601 are returned by the query. But what if, that query block contains three joined tables, or it contains a subquery in the where clause? So in the FROM clause there are three tables, and the fourth table is part of a where clause subquery, then how is this selectivity calculated?

Selectivity = number of rows satisfying a condition (from which table?) / total number of rows (from all the four tables?)

Same question for cardinality (cardinality = selectivity * total number of rows).

I found many articles about this, but each of them exemplifies these concepts with simple select statements, based on a single table or a single where clause condition.

Can someone give me an example of how are these measures calculated in case of a bit more complex query (on "hr" schema, or other training purpose schema), meaning subqueries in the FROM clause, or WHERE clause, and so on?

Thank you.

EDIT: I need some clarification about the selectivity measure, computed by the Estimator (Cost-Based Optimizer). http://gerardnico.com/wiki/database/oracle/selectivity For example, for an equality predicate (last_name = 'Smith'), selectivity is set to the reciprocal of the number n of distinct values of last_name, because the query selects rows that all contain one out of n distinct values.

I don't know how to understand that "reciprocal of the number n of distinct values".

Assuming the employees table have 107 rows, and the query Code: [Select all] [Show/ hide]

select * from employees where last_name = 'Smith'

returns 2 rows, the selectivity is 2/107 = 0.01? So it's the number of rows satisfying the predicate / total number of rows. So no "distinct" involved in this equation.

Apart from this selectivity of the statement, there is also a column selectivity, which is represented by the NDV (number of distinct values in that column - which can be queried from dba_tab_col_statistics) / total number of rows (http://www.runningoracle.com/product_info.php?products_id=233). So if NDV is 103, the selectivity of the last_name column is 103/107 = 0,96.

This is what I understood.. is this correct? If I'm wrong, please correct me.

Thank you.

Roger26
  • 57
  • 1
  • 6

1 Answers1

3

Selectivity is always based on whatever criteria are being applied at that time.

Exactly what this means for a particular table depends on the join order.

Oracle will always start executing a query by selecting rows from a particular table on it's own. In this case the selectivity is straightforward as per the examples you have read. There are no join conditions to take into account at this point.

Next, it joins in a second table. Oracle makes an estimate of how many rows will satisfy both the constant conditions directly on that table only, along with any join conditions from the first table. The later is called "join selectivity".

Then, when joining the third table it estimates based on joining to the first two tables as well as any constant conditions.

This is one of the reasons that join order is so important to a plan.

WW.
  • 23,793
  • 13
  • 94
  • 121