I'm aggregating (summing) some data from a purchases table, aggregated by total amount per region.
Data looks something like the following:
| id | region | purchase_amount |
| 1 | A | 30 |
| 2 | A | 35 |
| 3 | B | 41 |
The aggregated data then looks like this, ordered by total_purchases:
| region | total_purchases |
| B | 1238 |
| A | 910 |
| D | 647 |
| C | 512 |
I'd like to get a ranking for each region, ordered by total_purchases. I can do this using row_number (using SQLAlchemy at the moment) and this results in a table looking like:
| rank | region | total_purchases |
| 1 | B | 1238 |
| 2 | A | 910 |
| 3 | D | 647 |
| 4 | C | 512 |
However, there's one more column that I'd like to group by and that's:
I want region 'C' to always be the first row, but keep it's ranking.
This would ideally result in a table looking like:
| rank | region | total_purchases |
| 4 | C | 512 |
| 1 | B | 1238 |
| 2 | A | 910 |
| 3 | D | 647 |
I can do one or the other, but I can't seem to combine these 2 features together. If I use a row_number()
function, I get the proper ordering.
I can bring the region 'C' row always to the top using an ordering across 2 columns:
ORDER BY
CASE WHEN region = 'C' THEN 1 ELSE 0 DESC,
total_purchases DESC
However, I can't seem to combine these 2 requirements into the same query.