I faced a puzzling situation. A query had a good execution plan. But when that query was used as an inner query inside a larger query, that plan changed. I am trying to understand why it might be so.
This was on Oracle 11g. My query was:
SELECT * FROM YFS_SHIPMENT_H
WHERE SHIPMENT_KEY IN
(
SELECT DISTINCT SHIPMENT_KEY
FROM YFS_SHIPMENT_LINE_H
WHERE ORDER_HEADER_KEY = '20150113083918815889858'
OR ( ORDER_LINE_KEY IN ( '20150113084438815896336') )
);
As you can see, there is an inner query here, which is:
SELECT DISTINCT SHIPMENT_KEY
FROM YFS_SHIPMENT_LINE_H
WHERE ORDER_HEADER_KEY = '20150113083918815889858'
OR ( ORDER_LINE_KEY IN ( '20150113084438815896336') )
When I run just the inner query, I get the execution plan as:
PLAN_TABLE_OUTPUT
========================================================================================================
SQL_ID 3v82m4j5tv1k3, child number 0
=====================================
SELECT DISTINCT SHIPMENT_KEY FROM YFS_SHIPMENT_LINE_H WHERE
ORDER_HEADER_KEY = '20150113083918815889858' OR ( ORDER_LINE_KEY IN (
'20150113084438815896336') )
Plan hash value: 3691773903
========================================================================================================
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
========================================================================================================
| 0 | SELECT STATEMENT | | | | 10 (100)| |
| 1 | HASH UNIQUE | | 7 | 525 | 10 (10)| 00:00:01 |
| 2 | CONCATENATION | | | | | |
| 3 | TABLE ACCESS BY INDEX ROWID| YFS_SHIPMENT_LINE_H | 1 | 75 | 4 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | YFS_SHIPMENT_LINE_H_I4 | 1 | | 3 (0)| 00:00:01 |
|* 5 | TABLE ACCESS BY INDEX ROWID| YFS_SHIPMENT_LINE_H | 6 | 450 | 5 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | YFS_SHIPMENT_LINE_H_I6 | 6 | | 3 (0)| 00:00:01 |
========================================================================================================
Predicate Information (identified by operation id):
===================================================
4 = access("ORDER_LINE_KEY"='20150113084438815896336')
5 = filter(LNNVL("ORDER_LINE_KEY"='20150113084438815896336'))
6 = access("ORDER_HEADER_KEY"='20150113083918815889858')
The execution plan shows that the table YFS_SHIPMENT_LINE_H is accessed with two indexes YFS_SHIPMENT_LINE_H_I4 and YFS_SHIPMENT_LINE_H_I6; and then the results are concatenated. This plan seems fine and the query response time is great.
But when I run the complete query, the access path of the inner query changes as given below:
PLAN_TABLE_OUTPUT
=======================================================================================================
SQL_ID dk1bp8p9g3vzx, child number 0
=====================================
SELECT * FROM YFS_SHIPMENT_H WHERE SHIPMENT_KEY IN ( SELECT DISTINCT
SHIPMENT_KEY FROM YFS_SHIPMENT_LINE_H WHERE ORDER_HEADER_KEY =
'20150113083918815889858' OR ( ORDER_LINE_KEY IN (
'20150113084438815896336') ) )
Plan hash value: 3651083773
=======================================================================================================
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
=======================================================================================================
| 0 | SELECT STATEMENT | | | | 12593 (100)| |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 7 | 6384 | 12593 (1)| 00:02:32 |
| 3 | SORT UNIQUE | | 7 | 525 | 12587 (1)| 00:02:32 |
|* 4 | INDEX FAST FULL SCAN | YFS_SHIPMENT_LINE_H_I2 | 7 | 525 | 12587 (1)| 00:02:32 |
|* 5 | INDEX UNIQUE SCAN | YFS_SHIPMENT_H_PK | 1 | | 1 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID| YFS_SHIPMENT_H | 1 | 837 | 2 (0)| 00:00:01 |
=======================================================================================================
Predicate Information (identified by operation id):
===================================================
4 = filter(("ORDER_HEADER_KEY"='20150113083918815889858' OR
"ORDER_LINE_KEY"='20150113084438815896336'))
5 = access("SHIPMENT_KEY"="SHIPMENT_KEY")
Please note that the YFS_SHIPMENT_LINE_H is now being accessed with a different index (YFS_SHIPMENT_LINE_H_I2). As it turns out, this is not a very good index and the query response time suffers.
My question is: Why would the inner query execution plan change when it is run as part of the larger query? Once the optimizer has figured out the best way to access YFS_SHIPMENT_LINE_H, why wouldn't it continue to use the same execution plan even when it is part of the larger query?
Note: I am not too concerned about what would be the correct access path or the index to use; and hence not giving all the indexes on the table here; and the cardinality of the data. My concern is about the change when executed separately versus as part of another query.
Thanks.
-- Parag