I have a list of users, application IDs, and activity timestamps that track the users activity throughout the day. The data is structured such as it shows a line of events for every activity on every application ID: user A will go in to conduct 6 activities on application 123, then switch to application 456 to log 4 activities, return to application 123 again for one additional activity, etc.
I've tried using lead ()
and lag ()
functions but am running into issue with data structure, especially when there's a single activity line for a given appl_id. Below is an example of my data.
|User| APPL_ID | ACTIVITY_TIME
A 123 11/20/2020 08:11:45 AM
A 123 11/20/2020 08:11:45 AM
A 123 11/20/2020 08:11:45 AM
A 123 11/20/2020 08:17:13 AM
A 123 11/20/2020 08:17:13 AM
A 123 11/20/2020 08:30:00 AM
A 456 11/20/2020 09:45:02 AM
A 456 11/20/2020 09:45:02 AM
A 456 11/20/2020 09:55:15 AM
A 456 11/20/2020 09:59:45 AM
A 123 11/20/2020 10:35:00 AM
A 789 11/20/2020 10:45:15 AM
A 789 11/20/2020 10:50:33 AM
B 951 11/20/2020 08:15:15 AM
B 951 11/20/2020 08:15:15 AM
B 951 11/20/2020 08:33:37 AM
B 012 11/20/2020 09:13:00 AM
C 852 11/20/2020 07:45:25 AM
C 852 11/20/2020 07:47:41 AM
C 741 11/20/2020 08:00:22 AM
C 852 11/20/2020 08:25:23 AM
C 852 11/20/2020 08:25:23 AM
C 852 11/20/2020 08:25:23 AM
C 852 11/20/2020 08:29:46 AM
In addition to needing the first and last activity timestamps by user and appl_id, I also need to calculate the time spent on each application by the user and the idle time between applications. Notice the caveat of application 123 at 10:35 where only one activity was logged so the IN and OUT times are both equal:
|User| APPL_ID | IN_TIME | OUT_TIME | IN_OUT_MIN | IDLE_MIN
A 123 11/20/2020 08:11 AM 11/20/2020 08:30 AM 19.0 -
A 456 11/20/2020 09:45 AM 11/20/2020 09:59 AM 14.0 75.0
A 123 11/20/2020 10:35 AM 11/20/2020 10:35 AM 0.0 36.0
A 789 11/20/2020 10:45 AM 11/20/2020 10:50 AM 5.0 10.0
B 951 11/20/2020 08:15 AM 11/20/2020 08:33 AM 18.0 -
B 012 11/20/2020 09:13 AM 11/20/2020 09:13 AM 0.0 50.0
C 852 11/20/2020 07:45 AM 11/20/2020 07:47 AM 2.0 -
C 741 11/20/2020 08:00 AM 11/20/2020 08:00 AM 0.0 13.0
C 852 11/20/2020 08:25 AM 11/20/2020 08:29 AM 4.0 25.0
These are the calculations:
in_out_time = out_time - in_time
idle_min = in_time - previous out_time
If previous OUT time is missing or from earlier date, then the idle_min calculation needs to return a blank.