3

I have a simple Oracle query with a plan that doesn't make sense.

SELECT
    u.custid AS "custid",
    l.listid AS "listid"
FROM
    users u
    INNER JOIN lists l ON u.custid = l.custid

And here’s what the autotrace explain tells me it has for a plan

------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |  1468K|    29M|       | 11548   (1)| 00:00:01 |
|*  1 |  HASH JOIN            |          |  1468K|    29M|  7104K| 11548   (1)| 00:00:01 |
|   2 |   INDEX FAST FULL SCAN| USERS_PK |   404K|  2367K|       |   266   (2)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL   | LISTS    |  1416K|    20M|       |  9110   (1)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("U"."CUSTID"="L"."CUSTID")
   3 - filter(TRUNC("SORT_TYPE")>=1 AND TRUNC("SORT_TYPE")<=16)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - this is an adaptive plan
   - 1 Sql Plan Directive used for this statement

What concerns me is predicate 3. sort_type does not appear in the query, and is not indexed at all. It seems to me that sort_type should not be involved in this query at all.

There is a constraint on lists.sort_type: (Yes, I know we could have sort_type be an INTEGER not a NUMBER)

sort_type   NUMBER DEFAULT 2 NOT NULL,
    CONSTRAINT lists_sort_type CHECK ( sort_type BETWEEN 1 AND 16 AND TRUNC(sort_type) = sort_type )

It looks to me that that filter is on sort_type is basically a tautology. Every row in lists must pass that filter because of that constraint.

If I drop the constraint, the filter no longer shows up in the plan, and the estimated cost goes down a little bit. If I add the constraint back, the plan uses the filter again. There's no significant difference in execution speed one way or the other.

I'm concerned because I discovered this filter in a much larger, more complex query that I was trying to optimize down from a couple of minutes of runtime.

Why is Oracle adding that filter, and is it a problem and/or pointing to another problem?

EDIT: If I change the constraint on sort_type to not have the TRUNC part, the filter disappears. If I split the constraint into two different constraints, the filter comes back.

Andy Lester
  • 91,102
  • 13
  • 100
  • 152
  • 1
    Hmm, I am curious about this also. Maybe try posting on asktom? – OldProgrammer Feb 08 '19 at 17:06
  • The optimizer takes as inputs - The query to optimize - All available database object statistics - System statistics, if available (CPU speed, single-block I/O speed, and so on—metrics about the physical hardware) - Initialization parameters - Constraints – Saxon Feb 08 '19 at 22:19
  • Look at this article by Tom Kyte: https://www.oracle.com/technetwork/testcontent/o39asktom-096149.html – Saxon Feb 08 '19 at 22:21

1 Answers1

3

Generally speaking, Oracle generates predicates based on your CHECK constraints whenever doing so will give the optimizer more information to generate a good plan. It is not always smart enough to recognize when those are redundant. Here is a short example in Oracle 12c using your table names:

-- Create the CUSTS table
CREATE TABLE custs ( custid number not null );
CREATE INDEX custs_u1 on custs (custid);

-- Create the LISTS table
CREATE TABLE lists 
  ( listid number not null, 
    sort_type number not null, 
    custid number,
    constraint lists_c1 check ( sort_type between 1 and 16 and
          trunc(sort_type) = sort_Type )
);

-- Explain a join
EXPLAIN PLAN FOR
SELECT /*+ USE_HASH(u) */ 
       u.custid AS "custid",
       l.listid AS "listid"
FROM custs u
INNER JOIN lists l ON u.custid = l.custid;

-- Show the plan
select * from table(dbms_xplan.display);

Plan hash value: 2443150416
-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |    39 |     3   (0)| 00:00:01 |
|*  1 |  HASH JOIN         |          |     1 |    39 |     3   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN  | CUSTS_U1 |     1 |    13 |     1   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| LISTS    |     1 |    26 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("U"."CUSTID"="L"."CUSTID")

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

So far, nothing weird. No questionable predicates added.

Now, let's tell the Oracle optimizer that the distribution of data on TRUNC(sort_type) might matter...

declare
  x varchar2(30);
begin
  x := dbms_stats.create_extended_stats ( user, 'LISTS', '(TRUNC(SORT_TYPE))');
  dbms_output.put_line('Extension name = ' || x);
end;

... and, now, let's explain that same query again...

-- Re-explain the same join as before
EXPLAIN PLAN FOR
SELECT /*+ USE_HASH(u) */ 
       u.custid AS "custid",
       l.listid AS "listid"
FROM custs u
INNER JOIN lists l ON u.custid = l.custid;

-- Show the new plan
select * from table(dbms_xplan.display);
Plan hash value: 2443150416

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |     1 |    52 |     3   (0)| 00:00:01 |
|*  1 |  HASH JOIN         |          |     1 |    52 |     3   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN  | CUSTS_U1 |     1 |    13 |     1   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| LISTS    |     1 |    39 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("U"."CUSTID"="L"."CUSTID")
   3 - filter(TRUNC("SORT_TYPE")>=1 AND TRUNC("SORT_TYPE")<=16)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

Now, Oracle has added the predicate, because the CBO might benefit from it. Does it really benefit from it? No, but Oracle is only smart enough to know that it might and that it doesn't(*) hurt anything.

(*) there have been numerous bugs in previous versions where this _has_ hurt things by messing up the selectivity estimated by the CBO.

The presence of extended statistics is only one example reason of why Oracle might think it could benefit from this predicate. To find out if that is the reason in your case, you can look for extended statistics in your database like this:

SELECT * FROM dba_stat_extensions where table_name = 'LISTS';

Keep in mind, the Oracle CBO can create stat extensions on its own. So there could be extended stats that you didn't realize were there.

Matthew McPeak
  • 17,705
  • 2
  • 27
  • 59
  • Thanks for the in-depth answer. It just seems strange to me to add something to the plan where by definition it will not do anything. It's like adding a filter for `col IS NOT NULL` when `col` is defined as `NOT NULL` already. And we're not dealing with any weird views like the Ask Tom article that @Saxon mentioned above https://www.oracle.com/technetwork/testcontent/o39asktom-096149.html – Andy Lester Feb 11 '19 at 15:28
  • 1
    @AndyLester I agree it's weird. Part of me suspects that (A) Oracle adds these predicates before it does all its query transformation logic, since Q.T. is where it would rearrange predicates based on transitive closure and (B) having added them and then performed Q.T., it is not simple for Oracle to know which ones it really didn't need. But I don't really know. A 10053 trace on your query might prove interesting. – Matthew McPeak Feb 11 '19 at 18:09