0

I've got the latest SQL 2012 standard edition:

Microsoft SQL Server 2012 (SP3-CU3) (KB3152635) - 11.0.6537.0 (X64) Apr 28 2016 17:57:34 Copyright (c) Microsoft Corporation Standard Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1) (Hypervisor)

On a simple count based on 2 variables (month and year) of the number of users per customer where the application is a specific one the row estimate is 10x the actual result. I've freed up the cache, rebuilt indexes, made sure it's doing a index seek and tried recompiling the plan. If I input the predicates by hand, I get the expected result and plan, but if I use dateadd I get a blown version and since I have multiple counts in a view using CTE's I am looking to find a way around the massive over estimate of rows which I believe dateadd is causing.

    SELECT
    [Customer],
    [Month],
    [Year],
    COUNT([Username]) AS 'RDS'
    FROM [MS_CitrixUsers]
    WHERE [Application] = 'RDS'
    AND [Month] = DATENAME(month, dateadd(month, -1,GETDATE()))
    AND [YEAR] = DATENAME(year, dateadd(month, -1,GETDATE()))
    GROUP BY [MS_CitrixUsers].[Customer],
    [MS_CitrixUsers].[Month],
    [MS_CitrixUsers].[Year]

The count triggers a stream aggregate after the sort which contains the inflated row count.

Plan

Stream Properties

What my view looks like:

    WITH RDS1 
    AS
    (
    SELECT
    [Customer],
    COUNT([Username]) AS 'RDS1'
    FROM [MS_CitrixUsers]
    WHERE [Application] = 'RDS'
    AND [Month] = @month 
    AND [YEAR] = @year
    GROUP BY [MS_CitrixUsers].[Customer],
    [MS_CitrixUsers].[Month],
    [MS_CitrixUsers].[Year]
    ),
    RDS2 
    AS
    (
    SELECT
    [Customer],
    [Month],
    [Year],
    COUNT([Username]) AS 'RDS2'
    FROM [MS_CitrixUsers]
    WHERE [Application] = 'RDS'
    AND [Month] = @monthbefore
    AND [YEAR] = @year2
    GROUP BY [MS_CitrixUsers].[Customer],
    [MS_CitrixUsers].[Month],
    [MS_CitrixUsers].[Year]
    ),
    ...

    SELECT c.Customer,
    CASE WHEN NOT EXISTS (SELECT [RDS1] FROM [RDS1] t1 WHERE t1.Customer =     c.Customer) 
AND EXISTS (SELECT [RDS2] FROM [RDS2] t2 WHERE t2.Customer = c.Customer) THEN '99999' ELSE ([RDS1] - Coalesce([RDS2], 0)) END AS 'RDS',
CASE WHEN NOT EXISTS (SELECT [CitrixUsage1] FROM [CitrixUsage] t1 WHERE t1.Customer = c.Customer) 
AND EXISTS (SELECT [CitrixUsage2] FROM [CitrixUsage2] t2 WHERE t2.Customer = c.Customer) THEN '99999' ELSE ([CitrixUsage1] - Coalesce([CitrixUsage2], 0)) END    AS 'Citrix Usage',

The intended resulting view I get is a difference between 2 months of data:

[Customer] [application1] ... [applicationN]

Customer1 -1 3

We have quite a few apps, so as you can imagine it takes quite a while to return results, so I'm seeing if optimising the estimates would help.

Workaround: I found that on another answer on here, that using a temp table, sticking my variables in there, and joining my table(s) on that avoids the stream aggregate, and uses a hash match instead which estimates the correct number of rows.

Rich Benner
  • 7,873
  • 9
  • 33
  • 39
Glix
  • 1
  • 2
  • You've mentioned that it's a view, are you making the temp table outside of this view? – Rich Benner Jun 24 '16 at 09:42
  • Only in the workaround for testing performance. I think I have exhausted everything around: http://sqlperformance.com/2016/04/sql-performance/surprises-dateadd – Glix Jun 24 '16 at 09:53

2 Answers2

0

You should try not to calculate in a predicate. You haven't mentioned whether this is a view or not but could you not declare the variables;

DECLARE @MonthVariable varchar(10); SET @MonthVariable = DATENAME(month, dateadd(month, -1,GETDATE()))
DECLARE @YearVariable int; SET @YearVariable = DATENAME(year, dateadd(month, -1,GETDATE()))

SELECT
    [Customer],
    [Month],
    [Year],
    COUNT([Username]) AS 'RDS'
FROM [MS_CitrixUsers]
WHERE [Application] = 'RDS'
    AND [Month] = @MonthVariable
    AND [YEAR] = @YearVariable
GROUP BY [MS_CitrixUsers].[Customer],
    [MS_CitrixUsers].[Month],
    [MS_CitrixUsers].[Year]

So it's a view, try this;

SELECT
    [Customer],
    [Month],
    [Year],
    COUNT([Username]) AS 'RDS'
FROM [MS_CitrixUsers] a
INNER JOIN (SELECT
             DATENAME(month, dateadd(month, -1,GETDATE())) CurMon
            ,DATENAME(year, dateadd(month, -1,GETDATE())) CurYear
            ) b
    ON a.[Month] = b.CurMon
    AND a.[Year] = b.CurYear
WHERE [Application] = 'RDS'
GROUP BY [MS_CitrixUsers].[Customer],
    [MS_CitrixUsers].[Month],
     [MS_CitrixUsers].[Year]
Rich Benner
  • 7,873
  • 9
  • 33
  • 39
  • Hi Rich, thanks for the quick reply. It's a view, so declarations are out I'm afraid. – Glix Jun 24 '16 at 09:36
  • @Glix ok, Try the second one I've posted – Rich Benner Jun 24 '16 at 09:41
  • Thanks Rich, tried the inner join, but it's still pumping it into a stream aggregate, even with option recompile. Guess I'm stuck with joining to a temp table and making it a SP instead. See part 4 of this: http://stackoverflow.com/questions/29117042/different-estimated-rows-on-same-index-operation – Glix Jun 24 '16 at 09:49
0

My workaround version:

CREATE TABLE #MasterDate( [Month1] varchar(50), [YEAR1] varchar(10),[Month2] varchar(50),[YEAR2] varchar(10))
INSERT INTO #MasterDate ([Month1],[YEAR1],[Month2],[YEAR2] )
    SELECT DATENAME(month, dateadd(month, -1,GETDATE())) 
    ,DATENAME(year, dateadd(month, -1,GETDATE()))
    ,DATENAME(month, dateadd(month, -2,GETDATE())) 
    ,DATENAME(year, dateadd(month, -2,GETDATE()))

;WITH RDS1 
AS
(
SELECT
[Customer]
,[MS_CitrixUsers].[Month]
,[MS_CitrixUsers].[YEAR]
,COUNT([Username]) AS 'RDS1'
FROM [MS_CitrixUsers]
INNER JOIN #MasterDate m ON [Month] = m.[Month1] and [YEAR] = m.[YEAR1]
WHERE [Application] = 'RDS'
GROUP BY [MS_CitrixUsers].[Customer],
[MS_CitrixUsers].[Month],
[MS_CitrixUsers].[Year]
),

RDS2 
AS
(
SELECT
[Customer],
[Month],
[Year],
COUNT([Username]) AS 'RDS2'
FROM [MS_CitrixUsers]
INNER JOIN #MasterDate m ON [Month] = m.[Month2] and [YEAR] = m.[YEAR2]
WHERE [Application] = 'RDS'
GROUP BY [MS_CitrixUsers].[Customer],
[MS_CitrixUsers].[Month],
[MS_CitrixUsers].[Year]
)
SELECT c.Customer,
CASE WHEN NOT EXISTS (SELECT [RDS1] FROM [RDS1] t1 WHERE t1.Customer = c.Customer) 
AND EXISTS (SELECT [RDS2] FROM [RDS2] t2 WHERE t2.Customer = c.Customer) THEN '99999' ELSE ([RDS1] - Coalesce([RDS2], 0)) END AS 'RDS'
FROM [Customers] c
INNER JOIN #MasterDate m ON [Month] = m.[Month2] and [YEAR] = m.[YEAR2]
LEFT JOIN RDS1 t1 ON t1.Customer = c.Customer
LEFT JOIN RDS2 t2 ON t2.Customer = c.Customer

DROP TABLE #MasterDate

Of course, this doesn't work as a view though, only SP. Question now is, does it solve my issue of speed, just another 50 cte's to edit.

Edit: completed my SP version of the view with a inner join to the #temp table instead of the dateadd view.

View time to complete: 37 seconds

SP: 11 seconds.

Glix
  • 1
  • 2