0

I faced a puzzling situation. A query had a good execution plan. But when that query was used as an inner query inside a larger query, that plan changed. I am trying to understand why it might be so.

This was on Oracle 11g. My query was:

SELECT * FROM YFS_SHIPMENT_H     
WHERE  SHIPMENT_KEY IN 
    (
      SELECT DISTINCT SHIPMENT_KEY 
      FROM YFS_SHIPMENT_LINE_H 
      WHERE  ORDER_HEADER_KEY = '20150113083918815889858'  
      OR ( ORDER_LINE_KEY IN (  '20150113084438815896336') ) 
    );

As you can see, there is an inner query here, which is:

SELECT DISTINCT SHIPMENT_KEY 
FROM YFS_SHIPMENT_LINE_H 
WHERE  ORDER_HEADER_KEY = '20150113083918815889858'  
OR ( ORDER_LINE_KEY IN (  '20150113084438815896336') ) 

When I run just the inner query, I get the execution plan as:

PLAN_TABLE_OUTPUT
========================================================================================================
SQL_ID  3v82m4j5tv1k3, child number 0
=====================================
SELECT DISTINCT SHIPMENT_KEY FROM YFS_SHIPMENT_LINE_H WHERE
ORDER_HEADER_KEY = '20150113083918815889858'  OR ( ORDER_LINE_KEY IN (
'20150113084438815896336') )

Plan hash value: 3691773903

========================================================================================================
| Id  | Operation                     | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
========================================================================================================
|   0 | SELECT STATEMENT              |                        |       |       |    10 (100)|          |
|   1 |  HASH UNIQUE                  |                        |     7 |   525 |    10  (10)| 00:00:01 |
|   2 |   CONCATENATION               |                        |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| YFS_SHIPMENT_LINE_H    |     1 |    75 |     4   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | YFS_SHIPMENT_LINE_H_I4 |     1 |       |     3   (0)| 00:00:01 |
|*  5 |    TABLE ACCESS BY INDEX ROWID| YFS_SHIPMENT_LINE_H    |     6 |   450 |     5   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN          | YFS_SHIPMENT_LINE_H_I6 |     6 |       |     3   (0)| 00:00:01 |
========================================================================================================

Predicate Information (identified by operation id):
===================================================

   4 = access("ORDER_LINE_KEY"='20150113084438815896336')
   5 = filter(LNNVL("ORDER_LINE_KEY"='20150113084438815896336'))
   6 = access("ORDER_HEADER_KEY"='20150113083918815889858')

The execution plan shows that the table YFS_SHIPMENT_LINE_H is accessed with two indexes YFS_SHIPMENT_LINE_H_I4 and YFS_SHIPMENT_LINE_H_I6; and then the results are concatenated. This plan seems fine and the query response time is great.

But when I run the complete query, the access path of the inner query changes as given below:

PLAN_TABLE_OUTPUT
=======================================================================================================
SQL_ID  dk1bp8p9g3vzx, child number 0
=====================================
SELECT * FROM YFS_SHIPMENT_H WHERE SHIPMENT_KEY IN ( SELECT DISTINCT
SHIPMENT_KEY FROM YFS_SHIPMENT_LINE_H WHERE ORDER_HEADER_KEY =
'20150113083918815889858' OR ( ORDER_LINE_KEY IN (
'20150113084438815896336') ) )

Plan hash value: 3651083773

=======================================================================================================
| Id  | Operation                    | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
=======================================================================================================
|   0 | SELECT STATEMENT             |                        |       |       | 12593 (100)|          |
|   1 |  NESTED LOOPS                |                        |       |       |            |          |
|   2 |   NESTED LOOPS               |                        |     7 |  6384 | 12593   (1)| 00:02:32 |
|   3 |    SORT UNIQUE               |                        |     7 |   525 | 12587   (1)| 00:02:32 |
|*  4 |     INDEX FAST FULL SCAN     | YFS_SHIPMENT_LINE_H_I2 |     7 |   525 | 12587   (1)| 00:02:32 |
|*  5 |    INDEX UNIQUE SCAN         | YFS_SHIPMENT_H_PK      |     1 |       |     1   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID| YFS_SHIPMENT_H         |     1 |   837 |     2   (0)| 00:00:01 |
=======================================================================================================

Predicate Information (identified by operation id):
===================================================

   4 = filter(("ORDER_HEADER_KEY"='20150113083918815889858' OR
              "ORDER_LINE_KEY"='20150113084438815896336'))
   5 = access("SHIPMENT_KEY"="SHIPMENT_KEY")

Please note that the YFS_SHIPMENT_LINE_H is now being accessed with a different index (YFS_SHIPMENT_LINE_H_I2). As it turns out, this is not a very good index and the query response time suffers.

My question is: Why would the inner query execution plan change when it is run as part of the larger query? Once the optimizer has figured out the best way to access YFS_SHIPMENT_LINE_H, why wouldn't it continue to use the same execution plan even when it is part of the larger query?

Note: I am not too concerned about what would be the correct access path or the index to use; and hence not giving all the indexes on the table here; and the cardinality of the data. My concern is about the change when executed separately versus as part of another query.

Thanks.

-- Parag

Boneist
  • 22,910
  • 1
  • 25
  • 40
Parag Deuskar
  • 87
  • 1
  • 9

3 Answers3

0

I'm not sure why the Oracle optimizer decides to change the execution path. But, I think this is a better way to write the query:

SELECT s.*
FROM YFS_SHIPMENT_H  s   
WHERE s.SHIPMENT_KEY IN (SELECT sl.SHIPMENT_KEY 
                         FROM YFS_SHIPMENT_LINE_H sl
                         WHERE sl.ORDER_HEADER_KEY = '20150113083918815889858' 
                        ) OR 
      s.SHIPMENT_KEY IN (SELECT sl.SHIPMENT_KEY 
                         FROM YFS_SHIPMENT_LINE_H sl
                         WHERE sl.ORDER_LINE_KEY IN ('20150113084438815896336')
                        );

Notes:

  • There is no need to have SELECT DISTINCT in a subquery for IN. I'm pretty sure that Oracle ignores it, but it could add overhead.
  • Splitting the logic into two queries makes it more likely that Oracle can use indexes for the query (the best ones are on YFS_SHIPMENT_LINE_H(ORDER_HEADER_KEY, SHIPMENT_KEY) and YFS_SHIPMENT_LINE_H(ORDER_LINE_KEY, SHIPMENT_KEY)).
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • You are right, the query can indeed be done in a different way. This one happens to be in a code which I can't change. ...... Also, you make a good point about not having to use `distinct` as part of the `in` clause. – Parag Deuskar Jan 27 '17 at 04:06
0

In the first query (not used as a subquery), the base table is accessed based on the conditions in the where clause. The indexes on the two columns involved are used for accessing the rows.

In the complex query, you are doing a semi-join. The optimizer, rightly or wrongly, has decided that it is more efficient to read the rows from the shipment table first, read the shipment_key, and use the index on shipment_key in the shipment_line table to retrieve rows to see if they are a match. The where clause conditions on the shipment_line table are now just filter predicates, they are not used to decide which rows to be retrieved from the table.

If you feel the optimizer got it wrong (which is possible, although not often with relatively simple queries like this one), make sure statistics are up-to-date. What would be relevant here is the size of each table, how many rows on average have the same shipment_key in shipment_line, and the selectiveness of the conditions in the where clause in the subquery. Keep in mind that for the outer query, it is not necessary to compute the subquery in full (and very likely Oracle does not compute it in full); for each row from the shipment table, as soon as a matching row in the shipment_line table is found that satisfies the where clause, the search for that shipment_key in shipment_line stops.

One thing you can do, if you really think the optimizer got it wrong, is to see what happens if you use hints. For example, you can tell the optimizer not to use the I2 index on shipment_line (pretend it doesn't exist) - see what plan it will come up with.

  • you mentioned that in the complex query oracle is reading rows from `shipment` table first and then use the `shipment_key` to read `shipment line` table. But looking at the execution plan, I felt that oracle is reading `shipment line` first (using fast full scan on `yfs_shipment_line_h_i2`). Can you please elaborate? Thanks. – Parag Deuskar Jan 25 '17 at 06:22
0

The join on shipment_key forces the optimizer to use the most selective index, in this case, the YFS_SHIPMENT_LINE_H_I2 index. Sterling created this index for this query and it is WRONG. Drop it (or make invisible) and watch your query pick up the correct plan. If you are hesitant to drop the index since it is part of the Sterling product, use SQL Plan Management baselines.

YFS_SHIPMENT_LINE_H_I2 SHIPMENT_KEY 1 YFS_SHIPMENT_LINE_H_I2 ORDER_HEADER_KEY 2 YFS_SHIPMENT_LINE_H_I2 ORDER_RELEASE_KEY 3 YFS_SHIPMENT_LINE_H_I2 ORDER_LINE_KEY 4 YFS_SHIPMENT_LINE_H_I2 REQUESTED_TAG_NUMBER 5

  • Can you add a brief explanation about what the block of text at the bottom of your response means? – Steve Westbrook Jan 24 '17 at 16:25
  • Why do you think this index is "wrong"? Also, you talk about a "join" - you realize the query doesn't do a join, right? –  Jan 24 '17 at 17:57
  • Peter, thanks. Steve/Mathguy, Peter answered based on the knowledge of the product which is making the call - and not just limited to information provided in the question. Steve, the block of text that Peter has added at the end is the columns used by the index YFS_SHIPMENT_LINE_H_I2. I am guessing it was a tabular form to start with (Index Name, Column Name, Column Position), but somehow has lost its formatting. – Parag Deuskar Jan 27 '17 at 04:11