0

I have a table that is arranged as such, need to get the average time each step takes, and am using the updated_at column to do so.

workflow_id step_name created_at updated_at
25 data_capture 2022-03-21 10:20:34 2022-03-21 10:20:55
25 client_signature 2022-03-21 10:20:34 2022-03-22 17:25:15
25 pm_signature 2022-03-21 10:20:34 2022-03-22 23:05:12
105 data_capture 2022-03-24 05:20:34 2022-03-24 10:20:55
105 client_signature 2022-03-24 05:20:34 2022-03-24 17:25:15
105 pm_signature 2022-03-24 05:20:34 2022-03-24 23:05:12

My issue is with the query I have, it is subtracting row 4 from row 3 which are unrelated workflows and making the time spent on data_capture for workflow id 105 incorrect.

I have tried using LAG, LEAD, PARTITION BY, to try and create new rows but I have not found a way to keep the subtractions separate by workflow ID. Here is how the calculation should look:

  1. updated_at from row 1 - created_at for row 1 = time spent on data_capture
  2. updated_at from row 2 - value from step 1 = time spent on client_signature
  3. updated_at from row 3 - value from step 2 = time spent on pm_signature
  4. process restarts for workflow 105

Below are some of my attempts

`WITH tb1 AS (
SELECT
workflow_id, 
workflow_type,
step_name,
step_status,
step_created_at,
step_updated_at,
TIMESTAMPDIFF(SECOND, updated_time, step_updated_at) / 3600 elapsed_time
FROM (
    SELECT *,
        LAG(step_updated_at) OVER (ORDER BY step_updated_at) updated_time
    FROM view_client_workflow_status
    ) s
)
SELECT
step_name,
AVG(elapsed_time)
FROM tb1
WHERE workflow_type = 'client_onboarding_fcc_ip'
GROUP BY 1`
Akina
  • 39,301
  • 5
  • 14
  • 25
  • *updated_at from row 2 - value from step 1* ??? Value from step 1 == the time diff between created_at and updated_at in row 1? If so then updated_at from row 2 (datetime) - value from step 1 (interval) will produce datetime and not interval. Provide desired output for shown source data. – Akina Jan 10 '23 at 04:52
  • Hello Akina, I would like to get an average of the time spent on each step so the output would look like data_capture - 2h client_signature - 3h pm_signature - 4h or whatever the numbers end up being. I think interval is better than datetime in this case for me. – andredelivery Jan 10 '23 at 19:06
  • One more time. Provide desired output for shown source data. – Akina Jan 10 '23 at 19:51

0 Answers0