I have a table that is arranged as such, need to get the average time each step takes, and am using the updated_at column to do so.
workflow_id | step_name | created_at | updated_at |
---|---|---|---|
25 | data_capture | 2022-03-21 10:20:34 | 2022-03-21 10:20:55 |
25 | client_signature | 2022-03-21 10:20:34 | 2022-03-22 17:25:15 |
25 | pm_signature | 2022-03-21 10:20:34 | 2022-03-22 23:05:12 |
105 | data_capture | 2022-03-24 05:20:34 | 2022-03-24 10:20:55 |
105 | client_signature | 2022-03-24 05:20:34 | 2022-03-24 17:25:15 |
105 | pm_signature | 2022-03-24 05:20:34 | 2022-03-24 23:05:12 |
My issue is with the query I have, it is subtracting row 4 from row 3 which are unrelated workflows and making the time spent on data_capture for workflow id 105 incorrect.
I have tried using LAG, LEAD, PARTITION BY, to try and create new rows but I have not found a way to keep the subtractions separate by workflow ID. Here is how the calculation should look:
- updated_at from row 1 - created_at for row 1 = time spent on data_capture
- updated_at from row 2 - value from step 1 = time spent on client_signature
- updated_at from row 3 - value from step 2 = time spent on pm_signature
- process restarts for workflow 105
Below are some of my attempts
`WITH tb1 AS (
SELECT
workflow_id,
workflow_type,
step_name,
step_status,
step_created_at,
step_updated_at,
TIMESTAMPDIFF(SECOND, updated_time, step_updated_at) / 3600 elapsed_time
FROM (
SELECT *,
LAG(step_updated_at) OVER (ORDER BY step_updated_at) updated_time
FROM view_client_workflow_status
) s
)
SELECT
step_name,
AVG(elapsed_time)
FROM tb1
WHERE workflow_type = 'client_onboarding_fcc_ip'
GROUP BY 1`