I'm new to SQL and I have problems when trying to make an hourly report on a database that supports HiveSQL.
Here's my dataset
|NAME| CHECKIN_HOUR |CHECKOUT_HOUR|
|----|--------------|-------------|
| A | 00 | 00 |
| B | 00 | 01 |
| C | 00 | 02 |
| D | 00 | null |
| E | 01 | 02 |
| F | 01 | null |
And I would like to get an hourly summary report that looks like this:
|TIME| CHECKIN_NUMBER |CHECKOUT_NUMBER|STAY_NUMBER|
|----|----------------|---------------|-----------|
| 00 | 4 | 1 | 3 |
| 01 | 2 | 1 | 4 |
| 02 | 0 | 2 | 2 |
stay_number
means counting the number of people that haven't checked out by the end of that hour, e.g 2
at the last row means that by the end of 2am, there're two people (D and F) haven't checked out yet. So basically I'm trying to get a summarize check-in, check-out and stay report for each hour.
I've no idea how to compute an hourly interval table since simply grouping by check_in or check_out hour doesn't get the expected result. All the date field is originally in Unix timestamp data type, so feel free to use date functions on it.
Any instructions and help would be greatly appreciated, thanks!