Query runs slow in one dataset, but faster in another bigger dataset; MSSQL server; Different execution plan. Why?

Question

Here is the query:

select nd1.ansos_id
from nursdate nd1
where nd1.division_id = 2
  and nd1.unit_id = 19
  and nd1.nursdate_start_date =
          (select min(nd2.nursdate_start_date)
           from nursdate nd2
           where nd2.ansos_id = nd1.ansos_id
                 and nd2.nursdate_start_date >= all
                      (select nd3.nursdate_start_date
                       from nursdate nd3
                       where nd3.ansos_id = nd1.ANSOS_ID
                             and nd3.nursdate_start_date <= '2017-08-13 00:00:00'))

Below is true for both datasets

Statistics up to date;
Indices defragmented
General Properties for both datasets are the same: ex: Collation;
Also the 2 datasets have about the same amount of data. Actually the faster one has more data

Now it takes ~8 seconds to run in 1 dataset, but <1 second in another dataset. Here are the execution plan difference between the 2 datasets

"Actual number of Rows" & "Number of executions" in the slower dataset is astronomically higher
The faster one also has an additional node "Index Spool"

Screenshot #1: Same query, slow execution plan in one dataset

Screenshot #2: Same query, speedy execution plan in another dataset

How to address this? What can I do to get it to run fast in the 1st dataset? Thanks!

[EDIT] Slower Execution plan: (note the entire "nursdate" table has only 99K rows) https://www.brentozar.com/pastetheplan/?id=r1ZFFuNt-

Faster Execution plan: (note the entire "nursdate" table has 333K rows. It is somehow faster) https://www.brentozar.com/pastetheplan/?id=rJYMc_EKb

[EDIT] Here is some info on the data volume. The one on "mmt" has less data but runs slower

--mmt cnt: 99347
select count(*)
from mmt_sqa_v60.mmtsqav60.nursdate nd1

--heo cnt: 333275
select count(*)
from heo_sqa_v60_2k12.heosqav602k12.nursdate nd1

--mmt cnt: 2403
select count(*)
from mmt_sqa_v60.mmtsqav60.nursdate nd1
where nd1.division_id = 2
and nd1.unit_id = 19
and nd1.nursdate_start_date <= '2017-08-13 00:00:00'


--heo cnt: 5537
select count(*)
from heo_sqa_v60_2k12.heosqav602k12.nursdate nd1
where nd1.division_id = 1
and nd1.unit_id = 20
and nd1.nursdate_start_date <= '2017-08-13 00:00:00'

Please use this tool to post both execution plans: https://www.brentozar.com/pastetheplan/ — Eli, Aug 30 '17 at 16:14
What you mean are two dataset? If they use different tables shouldnt you have two different queries? If query use different parameters then we also need two queries so we can compare. — Juan Carlos Oropeza, Aug 30 '17 at 16:17
The answer to your question is right here ["Actual number of Rows" & "Number of executions" in the slower dataset is astronomically higher]. 4.1 million rows and 9k rows. The execution plans are likely going to be different because we are talking about magnitudes of difference in data volume. — Sean Lange, Aug 30 '17 at 16:20
Your pictures indicate you are using two different server instances and two different versions. **NEVER** assume that the databases are the same unless you have take steps to verify that they are (and there is only one way to do that). In addition, you need to understand the differences in hardware and how they impact the server instance. Are your server instances configered the same as well? They also indicate that you are using the RTM version of sql server - so stop wasting your time and update to the current SP. — SMor, Aug 30 '17 at 16:21
@SeanLange Im confuse. Because OP in the title say the larger dataset is the fastest, but then in the picture say the bigger dataset is slower. And as you say should be the slower because one has 4 millions rows and the other only 900 rows. — Juan Carlos Oropeza, Aug 30 '17 at 16:34
@JuanCarlosOropeza the OP contradicts themself to be fair. At one point they state the faster one has more data but then state that the actual number of rows is astronomically higher in the slower one. — Sean Lange, Aug 30 '17 at 17:07
@SeanLange I have added some data volume info. I do not understand where the 4 million rows coming from in the slower execution plan -- table does not have that many rows at all — riceball, Aug 30 '17 at 17:31
@Juan Carlos Oropeza here "2 datasets"-- same schema but different data — riceball, Aug 30 '17 at 17:42
@Eli Thanks for the tip. Just posted both execution plans with the tool — riceball, Aug 30 '17 at 17:54
@riceball Still isnt clear. You should have Servers, Database, Schemas and Tables. — Juan Carlos Oropeza, Aug 30 '17 at 18:00
@Juan Carlos Oropeza Just did some tests to have both datasets on the same database server with current SP. But this issue still could be reproduced-- it sometimes has this astronomical high "Actual Number of rows" & "Number of execution". So I guess the factor here probably is not due to server instance — riceball, Aug 30 '17 at 19:21
Dont say Datasets. There is not dataset on Sql Server. Only Databases, Schema and Tables. — Juan Carlos Oropeza, Aug 30 '17 at 19:22

Juan Carlos Oropeza · Answer 1 · 2017-08-30T19:09:36.720

Not related to your problem but your query have some issues. I think your query function is find ansos_id with any record before @date correct me if I'm wrong.

The deepest level brings all the dates for ansos_id previous to a @date

select nd3.nursdate_start_date
from nursdate nd3
where nd3.ansos_id = nd1.ANSOS_ID
  and nd3.nursdate_start_date <= '2017-08-13 00:00:00')

then previous query compare against all those dates.

and nd2.nursdate_start_date >= all ( ...)

That is very inefficient, because the only date bigger or equal to all dates is the MAX(date)

Then you (select min(nd2.nursdate_start_date) when as I already explain there is only one date >= to all

I will say your query should be replace with this will return the last record before @date for each ansos_id:

WITH cte as (    
     select nd1.ansos_id, nursdate_start_date,
            RANK() OVER (PARTITION BY ansos_id
                         ORDER BY nursdate_start_date DESC) as rn
     from nursdate nd1
     where nd1.division_id = 2
       and nd1.unit_id = 19
       and nd1.nursdate_start_date <= '2017-08-13 00:00:00'
)
SELECT *
FROM cte
WHERE rn = 1

Thanks for helping out here. It is a 2 step query; Step 1: if there are any records <= the target date, return the latest one of those, Step 2: else if step #1 returns null, then return the earliest record > the target date. — riceball, Aug 30 '17 at 19:17
My query do Step1. But your query doesn't seem to do Step 2 neither? Can be repeated dates? — Juan Carlos Oropeza, Aug 30 '17 at 19:18
This query does step #2 (that is what the "min" is about). No, it is not about repeated dates. Also I tried something else (have both datasets in the same database server). I added a comment to the question. — riceball, Aug 30 '17 at 19:24

score 0 · Accepted Answer · answered Sep 06 '17 at 20:57

Mystery solved. It is a data issue. Here is where the high number of "execution plan" comes from:

select count(*)
from mmtsqav60.NURSDATE ndArea
left outer join mmtsqav60.NURSDATE ndRelated on ndRelated.ANSOS_ID = 
ndArea.ANSOS_ID
where ndArea.DIVISION_ID=2 and ndArea.UNIT_ID=19;

The result is 4157613 which is exactly how many rows the execution plan said are returned by the index seeks on nd2 and nd3. Because, the above logic is, in essence, what that nursdate query in WSM is asking the database server to do – find all records for an area and then for each such record find all records for that ansos_id. The data is unrealistic data for us, in that one single "ansos_id" has thousands of records. Thanks for all the people who helped here.

Query runs slow in one dataset, but faster in another bigger dataset; MSSQL server; Different execution plan. Why?

2 Answers2