Here is the query:
select nd1.ansos_id
from nursdate nd1
where nd1.division_id = 2
and nd1.unit_id = 19
and nd1.nursdate_start_date =
(select min(nd2.nursdate_start_date)
from nursdate nd2
where nd2.ansos_id = nd1.ansos_id
and nd2.nursdate_start_date >= all
(select nd3.nursdate_start_date
from nursdate nd3
where nd3.ansos_id = nd1.ANSOS_ID
and nd3.nursdate_start_date <= '2017-08-13 00:00:00'))
Below is true for both datasets
- Statistics up to date;
- Indices defragmented
- General Properties for both datasets are the same: ex: Collation;
- Also the 2 datasets have about the same amount of data. Actually the faster one has more data
Now it takes ~8 seconds to run in 1 dataset, but <1 second in another dataset. Here are the execution plan difference between the 2 datasets
- "Actual number of Rows" & "Number of executions" in the slower dataset is astronomically higher
- The faster one also has an additional node "Index Spool"
Screenshot #1: Same query, slow execution plan in one dataset
Screenshot #2: Same query, speedy execution plan in another dataset
How to address this? What can I do to get it to run fast in the 1st dataset? Thanks!
[EDIT] Slower Execution plan: (note the entire "nursdate" table has only 99K rows) https://www.brentozar.com/pastetheplan/?id=r1ZFFuNt-
Faster Execution plan: (note the entire "nursdate" table has 333K rows. It is somehow faster) https://www.brentozar.com/pastetheplan/?id=rJYMc_EKb
[EDIT] Here is some info on the data volume. The one on "mmt" has less data but runs slower
--mmt cnt: 99347
select count(*)
from mmt_sqa_v60.mmtsqav60.nursdate nd1
--heo cnt: 333275
select count(*)
from heo_sqa_v60_2k12.heosqav602k12.nursdate nd1
--mmt cnt: 2403
select count(*)
from mmt_sqa_v60.mmtsqav60.nursdate nd1
where nd1.division_id = 2
and nd1.unit_id = 19
and nd1.nursdate_start_date <= '2017-08-13 00:00:00'
--heo cnt: 5537
select count(*)
from heo_sqa_v60_2k12.heosqav602k12.nursdate nd1
where nd1.division_id = 1
and nd1.unit_id = 20
and nd1.nursdate_start_date <= '2017-08-13 00:00:00'