I have to tables in Redshift, articles and clicks:
articles
| articleID | authorID |
| 100 | 2 |
| 101 | 2 |
| 102 | 6 |
| 103 | 7 |
| 104 | 2 |
clicks
|articleID | category |
|100 | "mail" |
|100 | "mail" |
|100 | "rss" |
|101 | "rss" |
|101 | "mail" |
|101 | "app" |
|101 | "app" |
Now I want to aggregate for each articleID the number of clicks per category as well as the total amount of clicks with this query
SELECT clicks.articleID,
ANY_VALUE(article.authorID) AS authorID,
COUNT(CASE WHEN clicks.category ='rss' THEN 1 END) as rss,
COUNT(CASE WHEN clicks.category ='mail' THEN 1 END) as mail,
COUNT(CASE WHEN clicks.category ='app' THEN 1 END) as app,
COUNT(clicks.articleID) AS total
FROM clicks
INNER JOIN articles
ON clicks.articleID = articles.articleID
GROUP BY clicks.articleID
It results in
| articleID | authorID | rss | mail | app | total |
| 100 | 2 | 1 | 2 | 0 | 3 |
| 101 | 2 | 1 | 1 | 2 | 4 |
which is fine. Now I want to set the category colums dynamically. So when an additional category in "clicks" appears, e.g. "facebook", I do not have to alter the query for getting
| articleID | authorID | rss | mail | app | facebook | total |
| 100 | 2 | 1 | 2 | 0 | 1 | 4 |
| 101 | 2 | 1 | 1 | 2 | 0 | 4 |
I found this one
WITH categories AS (SELECT category from clicks GROUP BY category)
as a prefix to the query But how do I access categories in the main query, something like
WITH categories AS (SELECT category from clicks GROUP BY category)
SELECT clicks.articleID,
ANY_VALUE(article.authorID) AS authorID,
--pseudocode
FOREACH(categories AS category)
COUNT(CASE WHEN clicks.category =$category THEN 1 END) as $category
ENDFOREACH
--//pseudocode
COUNT(clicks.articleID) AS total
FROM clicks
INNER JOIN articles
ON clicks.articleID = articles.articleID
GROUP BY clicks.articleID
?
Of course I could first query all categories and then run a query for each category, but I hope that there is a more elegant way