5

I have the following query and it's proving very costly and is taking 6-8 seconds to execute. Looking at the execution plan, the cost is 79% on a SORT operation. Can I get any improvement here?

IMG

    SELECT
        A.StageName,
        C.Month,
        C.MonthName as Label,
        C.Year,
        isnull(A.Average,0) as Data
    FROM
    (   
        SELECT 
            S.StageName,
            MONTH(TimeIn) as MonthNumber, 
            DATENAME(MONTH,TimeIn) as Month, 
            YEAR(TimeIn) as Year, 
            ISNULL(AVG(DATEDIFF(mi,TimeIn,isnull(TimeOut,@TodayDate))),0) as Average
        FROM
            VisitMovement VM
        INNER JOIN Stage S on
            VM.StageID = S.StageID
        WHERE 
            (VM.TimeIn >= @StartDate AND 
            VM.TimeIn  < DATEADD (d,1,@EndDate)) AND
            (VM.TimeOut < DATEADD (d,1,@EndDate) OR VM.TimeOut IS NULL)
        GROUP BY
            S.StageNumber,
            S.StageName,
            MONTH(TimeIn), 
            DATENAME(MONTH,TimeIn), 
            YEAR(TimeIn)
    ) A
    RIGHT JOIN (select distinct Month,MonthName,Year from Calendar WHERE DATE >= @StartDate AND DATE < DATEADD (d,1,@EndDate)) C on
        A.MonthNumber = C.Month and
        A.Month = C.MonthName and
        A.Year = C.Year 
    GROUP BY
        A.StageName,
        C.Month,
        C.MonthName,
        C.Year,
        A.Average   
    ORDER BY  
        CASE WHEN @Ordering = 'asc'  THEN C.Year            END ASC, 
        CASE WHEN @Ordering = 'asc'  THEN C.Month           END ASC,
        CASE WHEN @Ordering = 'asc'  THEN A.StageName       END ASC,
        CASE WHEN @Ordering = 'desc' THEN C.Year            END DESC, 
        CASE WHEN @Ordering = 'desc' THEN C.Month           END DESC,
        CASE WHEN @Ordering = 'desc'  THEN A.StageName      END DESC
shA.t
  • 16,580
  • 5
  • 54
  • 111
Philip
  • 2,460
  • 4
  • 27
  • 52
  • The most expensive sort in your plan is happening because of the `Stream Aggregate` iterator as this iterator expects sorted inputs, and the `Stream Aggregate` iterator is there because of your `inner group by` clause. Try to get the unique rows without using the `Group by` clause, that will drop the `Stream Aggregate` and the sort will disappear too. – M.Ali Dec 20 '16 at 16:36
  • Thanks @M.Ali - that's impossible unfortunately as I need to group all the records in the table in order to calculate the overall average. – Philip Dec 20 '16 at 20:41

2 Answers2

0

Because the order by has to evaluate each row, I don't think it can make optimal use of indices. Replacing the order by with a row_number() for the default sorting and only reversing the order once should at the least prevent multiple evaluations of @Ordering for a single row.

In the pseudo code below, the original query is put in a CTE. Row_number determines the ascending sort and below the cte the order is reversed if needed:

;with cte as
(
    SELECT
        A.StageName,
        C.Month,
        ...,
        row_number() over (order by C.Year,C.Month,A.StageName) sortOrder
    FROM
       ...rest of the query, excluding the order by
)
select * --or list the columns without the sortOrder 
from cte
order by sortOrder * case @Ordering when 'desc' then -1 else 1 end 
Me.Name
  • 12,259
  • 3
  • 31
  • 48
  • Thanks for that @Me.Name - appreciate the advice. Using the cte approach however makes the query even slower unfortunately. With the CTE approach, it also puts a yellow explanation mark over the top of the Sort operation. – Philip Dec 19 '16 at 22:07
0

Although I understand you can't get rid of the GROUP BY on the different columns in the sub-query, you can make it easier for the system.

Currently you have

S.StageNumber,
S.StageName,
MONTH(TimeIn), 
DATENAME(MONTH,TimeIn), 
YEAR(TimeIn)

Which I guess is quite a lot of data to go through. Allow me to make some guesses:

S.StageNumber,  -- int, 4 bytes
S.StageName, -- string, 20 bytes
MONTH(TimeIn),  -- int, 4 bytes
DATENAME(MONTH,TimeIn),  -- string 5 bytes
YEAR(TimeIn) -- int, 4 byte

Now, there are some dependencies:

  • If you know the MONTH (number), then you also now the name of it
  • I'm guessing that StageName + StageNumber is unique and directly related to StageID. If not you may have to GROUP BY once more in the outer layer.

That would bring us to

S.StageID,  -- int, 4 bytes
MONTH(TimeIn),  -- int, 4 bytes
YEAR(TimeIn) -- int, 4 byte

This means that the sorting for the GROUP BY has to run through only 12 bytes per record instead of the 37 bytes per record it was before, and numbers sort quite a bit faster than strings anyway (e.g. due to upper/lower case, accents, etc..)

I've tried to rewrite the query accordingly (Untested!). I've also moved the fetch of Month-information to a separate temp-table, this should help the Query Optimizer a little bit.

SELECT DISTINCT Month,MonthName,Year 
  INTO #dates
  FROM Calendar 
 WHERE DATE >= @StartDate AND DATE < DATEADD (d,1,@EndDate)

CREATE UNIQUE CLUSTERED INDEX uq0_#dates ON #dates (Month,Year)

SELECT
        A.StageName,
        C.Month,
        C.MonthName as Label,
        C.Year,
        isnull(A.Average,0) as Data
    FROM 
    (   
        SELECT S.StageName,
               MonthNumber,
               Year,
               Average
          FROM (    
                  SELECT VM.StageID,
                         MONTH(TimeIn) as MonthNumber, 
                         YEAR(TimeIn) as Year, 
                         ISNULL(AVG(DATEDIFF(mi,TimeIn,isnull(TimeOut,@TodayDate))),0) as Average
                    FROM
                         VisitMovement VM
                   WHERE 
                        (VM.TimeIn >= @StartDate AND 
                        VM.TimeIn  < DATEADD (d,1,@EndDate)) AND
                        (VM.TimeOut < DATEADD (d,1,@EndDate) OR VM.TimeOut IS NULL)
                  GROUP BY
                        VM.StageID,
                        MONTH(TimeIn), 
                        YEAR(TimeIn)
               ) grp
          JOIN Stage S 
            ON S.StageID = grp.StageID
        )
    ) A
    RIGHT OUTER JOIN #dates C
    on
        A.MonthNumber = C.Month and
        -- A.Month = C.MonthName and
        A.Year = C.Year 
    ORDER BY  
        CASE WHEN @Ordering = 'asc'  THEN C.Year            END ASC, 
        CASE WHEN @Ordering = 'asc'  THEN C.Month           END ASC,
        CASE WHEN @Ordering = 'asc'  THEN A.StageName       END ASC,
        CASE WHEN @Ordering = 'desc' THEN C.Year            END DESC, 
        CASE WHEN @Ordering = 'desc' THEN C.Month           END DESC,
        CASE WHEN @Ordering = 'desc'  THEN A.StageName      END DESC

Hope this helps.

deroby
  • 5,902
  • 2
  • 19
  • 33
  • Many apologies for the delayed response @deroby. I was away over the Xmas/New Year period. Thanks for your solution and Happy New Year :) – Philip Jan 02 '17 at 08:44