I have data in Hive that has two columns of interest. The first is a column (int) that represents a date (YYYYMM) and the second is a column (int) that represents a number of people for that date.
date people
202107 10
202001 2
202002 4
202003 6
201901 3
201902 5
201903 7
201809 11
I'm trying to write a query that sums up the number of people for comparison across two quarters. For example, I want the total of people that are between 202001 and 202003 vs. the total number of people between 201901 and 201903.
The result I would want is:
"2020 Q1" 12
"2019 Q1" 15
I've tried a couple of solutions like using a SELECT CASE however when I'm trying to do a group by you can't use an alias like MySQL in Hive. I'm a little stuck on this and would appreciate some help.