I have a very large table, millions of rows, with many columns, amongst them, Service, Start, and End. Service is a name, Start and End indicate outage times.
The table looks something like:
Service | Start | End
LAN | 1/1 12:00 | 3/1 12:00
LAN | 2/1 14:00 | 3/1 14:00
WAN | 5/1 10:00 | 7/1 08:00
WAN | 6/1 08:00 | 7/1 10:00
The aim is to create an aggregate table listing Service, date and outage length, something like:
Service | Date | Outage length
LAN | 1/1 | 12 h
LAN | 2/1 | 24 h
LAN | 3/1 | 14 h
WAN | 5/1 | 14 h
WAN | 6/1 | 24 h
WAN | 7/1 | 10 h
My workflow is as follows:
Step 1. To calculate overlapping times I have used this Stack Overflow solution. It works fine. Based on my input table, the output looks like:
Service | Start | End
LAN | 1/1 12:00 | 3/1 14:00
WAN | 5/1 10:00 | 6/1 10:00
Step 2. I've added a Date column, and a calculated Outage column to get the final result.
It works, big thanks to @horseyride for the original solution, but is it very slow and takes forever to run. I realised that there are a lot of records but I'm willing to bet that there is a better workflow algorithm which could speed up the process dramatically.