What is the best practice for joining 'shift' data and other time series data in Tableau? I am working with multiple geo data (from LA to India, UK, NY, Malaysia, Australia, China etc), and a lot of employees work past midnight.
For example, an employee has shift at 9 PM to 6 AM on 2016-07-31. The 'report date' is 2016-07-31 but no time zone information is provided.
This employee does work and there are events (time stamps in UTC) between 2016-07-31 21:00 to 2016-08-01 06:00. When I look at the events though, 7/31 will only have the events between 21:00 and 23:59. If I filter for just July, my calculations will be skewed (the event data will be cut off at midnight even though the shift extended to 6 AM).
I need to make calculations based upon the total time an employee was actually engaged with work (productive) and the total time they were paid. The request is for this to be daily/weekly/monthly.
If anyone can help me out here or give me some talking points to explain this to my superiors, it would be appreciated. This seems like it must be a common scenario. Do I need to request for a new raw data format or is there something I can do on my end?
the shift data only looks like this:
id date regular_hours overtime_hours total_hours
abc 2016-06-17 8 0.52 8.52
abc 2016-06-18 7.64 0.83 8.47
abc 2016-06-19 7.87 0.23 8.1
the event data is more detailed (30 minute interval data on events handled and the time it took to complete those events in seconds):
id date interval events event_duration
abc 2016-06-17 01:30:00 4 688
abc 2016-06-17 02:00:00 6 924
abc 2016-06-17 02:30:00 10 1320
So, you sum up the event_duration for an entire day and you get a number of seconds which was actually spent doing work. You can then compare this to amount of time that the employee was paid to see how efficient the staffing is.
My concern is that the event data has the date and the time (UTC). The payroll data only has a date without any time zone information. This causes inaccuracies when blending data in Tableau because some shifts cross midnight. Is there a way around this or do I need to propose new data requirements?
(FYI - people have been calculating it just based on the date for years most likely without considering time zones before. My assumption is that they just did not realize that this could cause inaccurate results)