I've got the latest SQL 2012 standard edition:
Microsoft SQL Server 2012 (SP3-CU3) (KB3152635) - 11.0.6537.0 (X64) Apr 28 2016 17:57:34 Copyright (c) Microsoft Corporation Standard Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1) (Hypervisor)
On a simple count based on 2 variables (month and year) of the number of users per customer where the application is a specific one the row estimate is 10x the actual result. I've freed up the cache, rebuilt indexes, made sure it's doing a index seek and tried recompiling the plan. If I input the predicates by hand, I get the expected result and plan, but if I use dateadd I get a blown version and since I have multiple counts in a view using CTE's I am looking to find a way around the massive over estimate of rows which I believe dateadd is causing.
SELECT
[Customer],
[Month],
[Year],
COUNT([Username]) AS 'RDS'
FROM [MS_CitrixUsers]
WHERE [Application] = 'RDS'
AND [Month] = DATENAME(month, dateadd(month, -1,GETDATE()))
AND [YEAR] = DATENAME(year, dateadd(month, -1,GETDATE()))
GROUP BY [MS_CitrixUsers].[Customer],
[MS_CitrixUsers].[Month],
[MS_CitrixUsers].[Year]
The count triggers a stream aggregate after the sort which contains the inflated row count.
What my view looks like:
WITH RDS1
AS
(
SELECT
[Customer],
COUNT([Username]) AS 'RDS1'
FROM [MS_CitrixUsers]
WHERE [Application] = 'RDS'
AND [Month] = @month
AND [YEAR] = @year
GROUP BY [MS_CitrixUsers].[Customer],
[MS_CitrixUsers].[Month],
[MS_CitrixUsers].[Year]
),
RDS2
AS
(
SELECT
[Customer],
[Month],
[Year],
COUNT([Username]) AS 'RDS2'
FROM [MS_CitrixUsers]
WHERE [Application] = 'RDS'
AND [Month] = @monthbefore
AND [YEAR] = @year2
GROUP BY [MS_CitrixUsers].[Customer],
[MS_CitrixUsers].[Month],
[MS_CitrixUsers].[Year]
),
...
SELECT c.Customer,
CASE WHEN NOT EXISTS (SELECT [RDS1] FROM [RDS1] t1 WHERE t1.Customer = c.Customer)
AND EXISTS (SELECT [RDS2] FROM [RDS2] t2 WHERE t2.Customer = c.Customer) THEN '99999' ELSE ([RDS1] - Coalesce([RDS2], 0)) END AS 'RDS',
CASE WHEN NOT EXISTS (SELECT [CitrixUsage1] FROM [CitrixUsage] t1 WHERE t1.Customer = c.Customer)
AND EXISTS (SELECT [CitrixUsage2] FROM [CitrixUsage2] t2 WHERE t2.Customer = c.Customer) THEN '99999' ELSE ([CitrixUsage1] - Coalesce([CitrixUsage2], 0)) END AS 'Citrix Usage',
The intended resulting view I get is a difference between 2 months of data:
[Customer] [application1] ... [applicationN]
Customer1 -1 3
We have quite a few apps, so as you can imagine it takes quite a while to return results, so I'm seeing if optimising the estimates would help.
Workaround: I found that on another answer on here, that using a temp table, sticking my variables in there, and joining my table(s) on that avoids the stream aggregate, and uses a hash match instead which estimates the correct number of rows.