23

Hi I have a table facility. Which holds a score for each day (Multiple scores can be reported each day and both would be valid)

I need to calculate the 90th percentile, SD, and Mean for score by month.

Facility:

Id   Month Date  score
1    Jan     1    5
1    Jan     1    5
1    Jan     2    3
1    Jan     3    4
1    Jan     4    4
1    Jan     5    4
1    Feb     1    5
1    Feb     1    5
1    Feb     2    3
1    Feb     3    4
1    Feb     4    4
1    Feb     5    4

Is there any way?

Thanks for your help.

Siyual
  • 16,415
  • 8
  • 44
  • 58
Agga
  • 285
  • 1
  • 2
  • 13

2 Answers2

53

You can use the new suite of analytic functions introduced in SQL Server 2012:

SELECT DISTINCT
            [Month],
            Mean   = AVG(Score) OVER (PARTITION BY [Month]),
            StdDev = STDEV(Score) OVER (PARTITION BY [Month]),
            P90    = PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY Score) OVER (PARTITION BY [Month])
FROM        my_table

There are 2 percentile functions: PERCENTILE_CONT for continuous distribution and PERCENTILE_DISC for discrete distribution. Picks one that suits your needs.

Code Different
  • 90,614
  • 16
  • 144
  • 163
  • 3
    Oooh, `PERCENTILE_CONT` - that's a new one to me, and looks like it is worth learning. Nice one - earns my upvote. – LDMJoe Nov 17 '15 at 20:18
  • 1
    I am getting error: The PERCENTILE_CONT function is not allowed in the current compatibility mode. It is only allowed in 110 mode or higher. – Agga Nov 17 '15 at 21:48
  • So you are running SQL Server 2012 but in an older mode? Check [this page](https://msdn.microsoft.com/en-CA/library/bb510680.aspx) to see how to change it – Code Different Nov 17 '15 at 21:50
  • It worked.. I had permissions as data reader as part of a group, but when our Admin added me as individual user I was able to run it. I can't tell you how much it helped me. Thanks once again. – Agga Nov 18 '15 at 00:30
3

Here's the setup...

CREATE TABLE Facility (Id INT NOT NULL, Month nvarchar(3) NOT NULL, Date INT NOT NULL, score INT NOT NULL)

INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 2, 3)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 3, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 4, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Jan', 5, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 1, 5)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 2, 3)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 3, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 4, 4)
INSERT INTO Facility (Id, Month, Date, score) VALUES (1, 'Feb', 5, 4)

Now, Standard Deviation and Mean are straight forward enough - there are built in aggregate functions for them...

SELECT
    [Month],
    AVG(CONVERT(real, score)) AS [Mean],
    STDEV(score) AS [Standard Deviation]
FROM
    Facility
GROUP BY
    [Month]

For your 90th percentile, you'll need to invent a function...

CREATE FUNCTION NintythPercentile(@Month nvarchar(3)) RETURNS INT AS
BEGIN
    DECLARE @ReturnValue INT

    SELECT 
        @ReturnValue = MIN(DerivedTopTenPercent.score) --AS [90th Percentile]
    FROM
        (
        SELECT TOP 10 PERCENT
            score
        FROM
            Facility
        WHERE
            [Month] = @Month
        ORDER BY
            score DESC
        ) DerivedTopTenPercent

    RETURN @ReturnValue
END

With that function in place, your final query will look like this...

SELECT
    [Month],
    AVG(CONVERT(real, score)) AS [Mean],
    STDEV(score) AS [Standard Deviation],
    dbo.NintythPercentile([Month]) AS [90th Percentile]
FROM
    Facility
GROUP BY
    [Month]
LDMJoe
  • 1,591
  • 13
  • 17