How do I define a custom metric in Apache Superset?

Question

I've been implementing superset at work, and I like it so far. However, I have such a table:

name,age,gender
John,42,M
Sally,38,F
Patricia,27,F
Steven,29,M
Amanda,51,F

I want to define a new metric against each name, counting the number of people who are younger. My data is in a MySQLdatabase, and I suppose that for one person, I could write the query thus:

SELECT COUNT(DISTINCT name) from users where users.age <= 42;

for, say, John's row. So, how do I do this continuously for the entire table?

How many records do you have and expect to have in this table? The standard approach would be a self join with <= operator on age. On a large table this would result in a huge resultset. — Shadow, Dec 21 '17 at 08:11

score 0 · Answer 1 · answered Dec 21 '17 at 16:13

Your query could look something like

select *, 
  (select count(distinct all_users.name) from users all_users where all_users.age <= users.age)
FROM users

To shadow's point - this would get quite expensive to run on a large dataset.

If that were the case, you'd probably want to try putting an index on age, or denormalize that count altogether - the tradeoff being that inserts would become slower.

How do I define a custom metric in Apache Superset?

1 Answers1