I understand that we can rewrite percentile_cont
within Teradata as:
SELECT
part_col
,data_col
+ ((MIN(data_col) OVER (PARTITION BY part_col ORDER BY data_col ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) - data_col)
* (((COUNT(*) OVER (PARTITION BY part_col) - 1) * x) MOD 1)) AS percentile_cont
FROM tab
QUALIFY ROW_NUMBER() OVER (PARTITION BY part_col ORDER BY data_col)
= CAST((COUNT(*) OVER (PARTITION BY part_col) - 1) * x AS INT) + 1;
See this very helpful discussion for more information.
Understanding that replacing x
with 0.90
would return the 90th percentile, is there an elegant way of extending this and returning multiple percentiles in one pass?
For example, say I want to extend this example and return the 25th, 50th, and 75th percentiles in one pass? Is this possible? Seems like I would need multiple QUALIFY
statements? Similarly, if I desire multiple GROUP BY
equivalents, is this akin to passing more columns in the PARTITION BY
?
-- SQL:2008 Equivalent pseudo-code
SELECT
part_col_a
,part_col_b
,PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY order_col) AS p25
,PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY order_col) AS p50
,PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY order_col) AS p75
FROM tab
GROUP BY
part_col_a
,part_col_b