BigQuery - Aggregate rows using standard SQL

Question

Consider this table

+------------+----------------+-------------+
|    date    |   region_name  |  population |
+------------+----------------+-------------+
| 2000-02-11 | Lower Normandy |   1.000.000 |
+------------+----------------+-------------+
| 2000-02-11 | Upper Normandy |   1.100.000 |
+------------+----------------+-------------+
| 2020-04-25 | Lower Normandy |   1.800.000 |
+------------+----------------+-------------+
| 2020-04-25 | Upper Normandy |   1.900.000 |
+------------+----------------+-------------+

I would like to aggregate rows for Lower Normandy and Upper Normandy as Normandy and, according to the date column, sum the population.

The expected result would be:

+------------+----------------+-------------+
|    date    |   region_name  |  population |
+------------+----------------+-------------+
| 2000-02-11 |       Normandy |   2.100.000 |
+------------+----------------+-------------+
| 2020-04-25 |       Normandy |   3.700.000 |
+------------+----------------+-------------+

The result of this aggregation would then be used to create a new view.

How can I do it in BigQuery using the standard SQL?

score 0 · Accepted Answer · answered Apr 25 '20 at 11:30

You can use a case expression to change the name:

select date,
       (case when region_name like '%Normandy' then 'Normandy' else region_name end) as region_name,
       sum(population) as population
from t
group by 1, 2;

This matches anything that ends in "Normandy". Of course, you could use region_name in ('Upper Normandy', 'Lower Normany') for more precision (and more typing).

Or, if you only care about these four rows, you can just assign the region_name:

select date, 'Normandy' as region_name, sum(population) as population
from t
where region_name like '%Normandy'
group by date;

BigQuery - Aggregate rows using standard SQL

1 Answers1