I want to weekly track the status of each user's record and whether they changed their status. Currently, there's a table to weekly update the records of each status. Example as below.
Weekly update Table (First Week) (2023-01-01)
Section |
Name |
Status |
---|---|---|
Sect.A |
Jane |
Inactive |
Sect.A |
Stephen |
Active |
Sect.B |
Jane |
Active |
Weekly update Table (Second Week) (2023-01-08)
Section |
Name |
Status |
---|---|---|
Sect.A |
Jane |
Active |
Sect.A |
Stephen |
Active |
Sect.B |
Jane |
Active |
Sect.C |
Chris |
Active |
Weekly update Table (Second Week) (2023-01-15)
Section |
Name |
Status |
---|---|---|
Sect.A |
Jane |
Regular |
Sect.A |
Stephen |
Active |
Sect.B |
Jane |
Active |
Sect.C |
Chris |
Inactive |
I want to design a data-traced table to observe the historical change for all users by weekly running the scheduler. Expected Table layout as below
First Week generated at 2023/01/01
UID |
Section |
Name |
Current Status |
Previous Status |
Updated_at |
---|---|---|---|---|---|
001 |
Sect.A |
Jane |
Active |
2023-01-01 |
|
002 |
Sect.A |
John |
Inactive |
2023-01-01 |
|
003 |
Sect.B |
Jane |
Active |
2023-01-01 |
Second Week generated at 2023/01/08
UID |
Section |
Name |
Current Status |
Previous Status |
Updated_at |
---|---|---|---|---|---|
001 |
Sect.A |
Jane |
Active |
2023-01-01 |
|
001 |
Sect.A |
Jane |
Inactive |
Active |
2023-01-08 |
002 |
Sect.A |
Stephen |
Active |
2023-01-01 |
|
003 |
Sect.B |
Jane |
Active |
2023-01-01 |
|
004 |
Sect.C |
Chris |
Active |
2023-01-08 |
Third Week generated at 2023/01/15
UID |
Section |
Name |
Current Status |
Previous Status |
Updated_at |
---|---|---|---|---|---|
001 |
Sect.A |
Jane |
Active |
2023-01-01 |
|
001 |
Sect.A |
Jane |
Inactive |
Active |
2023-01-08 |
001 |
Sect.A |
Jane |
Regular |
Inactive |
2023-01-15 |
002 |
Sect.A |
Stephen |
Active |
2023-01-01 |
|
003 |
Sect.B |
Jane |
Active |
2023-01-01 |
|
004 |
Sect.C |
Chris |
Active |
2023-01-08 |
|
004 |
Sect.C |
Chris |
Inactive |
Active |
2023-01-15 |
I was trying to using merge statement in Bigquery to make it, however, Im not that pretty sure wthether merge statement can make it or not. THought it's close but still incomplete.
Here's my work.
<pre><code>
CREATE TABLE IF NOT EXISTS `historical_table (
UID STRING,
SECTION STRING,
NAME STRING,
CURRENT_STATUS STRING,
PREVIOUS_STATUS STRING,
UPDATED_AT DATE
);
</code></pre>
<pre><code>
MERGE `historical_table`
USING(
---- get the last record for each uid
WITH lastest_record AS(
SELECT
*
FROM (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY section, name ORDER BY updated DESC) rnk
FROM
historical_table
)WHERE rnk = 1
)
SELECT
l.uid,
w.section,
w.name,
w.status as current_status,
l.status as previous_status,
CURRENT_DATE as updated_at
FROM
`weekly_update_table` as w
LEFT JOIN
lastest_record as l
ON w.section = l.section
AND w.name = l.name
) as weekly_table
ON historical_table.section = weekly_table.section
AND historical_table.name = weekly_table.name
AND historical_table.current_status != IFNULL(weekly_table.current_status, historical_table.current_status)
WHEN NOT MATCHED BY TARGET
THEN INSERT(
uid,
section,
name,
current_status,
previous_status,
updated_at)
VALUES
(generated_uuid(),
weekly_table.section,
weekly_table.name,
weekly_table.current_status,
"",
CURRENT_DATE()
)
</code></pre>