I have two tables, and would like to use them to help analyze churn.
The first table has subscription data per customer:
ACCT INVOICE_DATE REVENUE FIRST_INVOICE LAST_INVOICE
1234 2021-09-01 10 2021-09-01 2021-12-01
1234 2021-12-01 10 2021-09-01 2021-12-01
5678 2021-06-01 20 2021-06-01 2021-08-01
5678 2021-07-01 20 2021-06-01 2021-08-01
5678 2021-08-01 20 2021-06-01 2021-08-01
The other is a convenience table with all months since 1970, e.g.,
MONTH
1970-01-01
1970-02-01
[ ... ]
2023-02-01
2023-03-01
Note one customer in this example skipped a month, and both customers did not survive to the current year. To accurately deal with these churn events (there can be multiple over the lifetime of the account), I'd like to end up with something like this:
ACCT INVOICE_DATE REVENUE FIRST_INVOICE LAST_INVOICE
1234 2021-09-01 10 2021-09-01 2021-12-01
1234 2021-10-01 NULL 2021-09-01 2021-12-01
1234 2021-11-01 NULL 2021-09-01 2021-12-01
1234 2021-12-01 10 2021-09-01 2021-12-01
1234 2022-01-01 NULL 2021-09-01 2021-12-01
5678 2021-06-01 20 2021-06-01 2021-08-01
5678 2021-07-01 20 2021-06-01 2021-08-01
5678 2021-08-01 20 2021-06-01 2021-08-01
5678 2021-09-01 NULL 2021-06-01 2021-08-01
You can assume the date types for the months in each table are equivalent.
There are other options that I'm exploring with window functions, but I've used this approach in Pandas before and curious how I might translate that to SQL.