0

Similar to Superset Computing 90th percentile response time, except with a SQL Server data source. SQL Server has a different syntax for its percentile functions where the variable is not specified in the main function call and I can't figure out how to input it as a Superset metric.

I have Type as a Dimension in my Superset chart. I put AVG([Time]) as a metric, I get that value for each Type row. Great. Now to do 90th percentile.

For 90th percentile, I try

PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY [Time]) OVER (PARTITION BY [Type])

That errors with

"Column 'dbo.vw_Fire_Dashboard.Time' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

I've tried removing the OVER ... at the end as I did to adapt other metrics to Superset, but then it complains that

'PERCENTILE_CONT' must have an OVER clause.

Sam Firke
  • 21,571
  • 9
  • 87
  • 105
  • This is the structure of https://learn.microsoft.com/en-us/sql/t-sql/functions/percentile-cont-transact-sql?view=sql-server-ver16 and you can see that ORDER BY clause is very mandatory – Bogdan Sahlean Feb 16 '23 at 20:48
  • Did you find an answer, Sam? – Kir Apr 05 '23 at 17:32
  • Kind of. You can't do it at at the metric level as far as I can see, but I have a promising proof-of-concept where I compute the 90th percentile using a SQL View and then use Jinja to pass the Superset filters into the query. Then the calculation updates with the rest of the dashboard. I'll post as an answer if I get it fully developed. – Sam Firke Apr 09 '23 at 16:39
  • I've posted the full example below as an answer. – Sam Firke May 03 '23 at 14:22

1 Answers1

1

This can't be done with a Superset metric with SQL Server. But you can do it with Jinja templating. Here the dashboard user can select filter values of Alarm Date and Status and they'll be passed into the SQL query before the 90th percentile is calculated.

Enable the Jinja templating feature flag, then create a virtual dataset from SQL Lab. Save it and then edit it to look like this:

SELECT DISTINCT
[Incident Grouping],
PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY [Time (Minutes)]) OVER (PARTITION BY [Incident Grouping]) AS [90th Percentile]
FROM [database_name].[dbo].[table_name
WHERE 1 = 1
AND [Alarm Date] > '{{ from_dttm }}' AND [Alarm Date] < '{{ to_dttm }}'
AND [Status] in {{ filter_values('Status')|where_in }}

Incident Grouping is the variable I want to group by and I have dashboard native filters on my Superset dashboard that control the Alarm Date and Status variables in my table.

It's fussy to create and edit. This chart won't load in edit mode because of the Jinja, it's only visible in dashboard mode. And to sync the columns or change their name after the AS I have to remove the DISTINCT and the Jinja filter lines . That's why I have the WHERE 1 = 1, so I can comment out those filter lines, sync the columns, and then uncomment them.

Sam Firke
  • 21,571
  • 9
  • 87
  • 105