4

I need to calculate the 90th percentile of a list of values like this:

0.0099
0.0129
0.0031
0.0219
0.2632
0.0124
0.0493
0.05
0.0433

How would I go about the calculation? I know the answer that is 0.0713,9. Any suggestions?

DECLARE @Temp TABLE(DATA float)

INSERT INTO @Temp VALUES(0.0099)
INSERT INTO @Temp VALUES(0.0129)
INSERT INTO @Temp VALUES(0.0031)
INSERT INTO @Temp VALUES(0.0219)
INSERT INTO @Temp VALUES(0.2632)
INSERT INTO @Temp VALUES(0.0124)
INSERT INTO @Temp VALUES(0.0493)
INSERT INTO @Temp VALUES(0.05) 
INSERT INTO @Temp VALUES(0.0433)

SELECT DATA
FROM @Temp
ORDER BY DATA ASC

--90th percentile 
SELECT ((
        SELECT TOP 1 DATA
        FROM   (
                SELECT  TOP 90 PERCENT DATA
                FROM    @Temp
                WHERE   DATA IS NOT NULL
                ORDER BY DATA
                ) AS A
        ORDER BY DATA DESC) + 
        (
        SELECT TOP 1 DATA
        FROM   (
                SELECT  TOP 10 PERCENT DATA
                FROM    @Temp
                WHERE   DATA IS NOT NULL
                ORDER BY DATA DESC
                ) AS A
        ORDER BY DATA ASC)) / 2.0
shA.t
  • 16,580
  • 5
  • 54
  • 111
naheiwProg
  • 99
  • 2
  • 11

3 Answers3

5

A percentile (or a centile) is a measure used in statistics indicating the value below which a given percentage of observations in a group of observations fall. For example, the 20th percentile is the value (or score) below which 20 percent of the observations may be found.

There is no standard definition of percentile


The Nearest Rank method :

SELECT DATA
FROM (
    SELECT 
        DATA, 
        COUNT(1) OVER (PARTITION BY NULL) As N, 
        ROW_NUMBER() OVER (ORDER BY DATA) AS i
    FROM @Temp) t
WHERE
    i = ROUND(N * 90.00 / 100.00, 0, 0)

The Linear Interpolation Between Closest Ranks method :

DECLARE @P real = 90.00

SELECT MAX(tt.Pv)
FROM (
    SELECT 
        (CASE 
            WHEN i = k THEN DATA
            WHEN k = 0 AND P = MIN(CASE WHEN P > @P THEN P END) OVER (PARTITION BY NULL) THEN 
                DATA + N * (@P - P) / 100 * (MIN(CASE WHEN P > @P THEN DATA END) OVER (PARTITION BY NULL) - DATA)
            ELSE 0
        END) AS Pv
    FROM (
        SELECT 
            *,
            100.00 / N * (i - 1.00 / 2.00) AS P,
            CASE
                WHEN @P < 100.00 / N * (1 - 1.00 / 2.00) THEN 1 
                WHEN @P > 100.00 / N * (N - 1.00 / 2.00) THEN N
                WHEN @P = 100.00 / N * (i - 1.00 / 2.00) THEN i
                WHEN @P > 100.00 / N * (i - 1.00 / 2.00) THEN i
                ELSE 0
            END AS k
        FROM (
            SELECT 
                DATA, 
                COUNT(*) OVER (PARTITION BY NULL) As N, 
                ROW_NUMBER() OVER (ORDER BY DATA) AS i
            FROM @Temp) ti) t
    ) tt;
Community
  • 1
  • 1
shA.t
  • 16,580
  • 5
  • 54
  • 111
1

0.0713 is not a right answer, procentile is a value from the list

    DECLARE @Temp TABLE(DATA float)

INSERT INTO @Temp VALUES(0.0099)
INSERT INTO @Temp VALUES(0.0129)
INSERT INTO @Temp VALUES(0.0031)
INSERT INTO @Temp VALUES(0.0219)
INSERT INTO @Temp VALUES(0.2632)
INSERT INTO @Temp VALUES(0.0124)
INSERT INTO @Temp VALUES(0.0493)
INSERT INTO @Temp VALUES(0.05) 
INSERT INTO @Temp VALUES(0.0433) 

select max(case when rownum*1.0/numrows <= 0.9 then DATA end) as percentile_90th
from (select DATA,
             row_number() over (order by DATA) as rownum,
             count(*) over (partition by NULL) as numrows
      from @Temp
      where DATA is not null
     ) t

FROM T-SQL: Calculating the Nth Percentile Value from column

Community
  • 1
  • 1
Andrey Davydenko
  • 341
  • 2
  • 18
0

PERCENTILE_CONT function:

DECLARE @Temp TABLE(DATA float)

INSERT INTO @Temp VALUES(0.0099)
INSERT INTO @Temp VALUES(0.0129)
INSERT INTO @Temp VALUES(0.0031)
INSERT INTO @Temp VALUES(0.0219)
INSERT INTO @Temp VALUES(0.2632)
INSERT INTO @Temp VALUES(0.0124)
INSERT INTO @Temp VALUES(0.0493)
INSERT INTO @Temp VALUES(0.05) 
INSERT INTO @Temp VALUES(0.0433)

SELECT
    PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY e.DATA) OVER () AS PERCENTILE
FROM @Temp AS e
Backs
  • 24,430
  • 5
  • 58
  • 85
  • 1
    Given that this is a window function, I'd add `DISTINCT` to the query, or `TOP 1`. Otherwise, you'll get tons of redundant identical values. – Lukas Eder Jan 25 '19 at 15:19