I have the following two tables
People
+--------+---------------+-------------+
| Name | ContractStart | ContractEnd |
+--------+---------------+-------------+
| Kate | 20180101 | 20181231 |
| Sawyer | 20180101 | 20181231 |
| Ben | 20170601 | 20181231 |
+--------+---------------+-------------+
Shifts
+---------+--------+------------+----------+
| Station | Name | ShiftStart | ShiftEnd |
+---------+--------+------------+----------+
| Swan | Kate | 20180101 | 20180131 |
| Arrow | Kate | 20180301 | 20180331 |
| Arrow | Kate | 20180401 | 20181231 |
| Flame | Sawyer | 20180101 | 20181231 |
| Swan | Ben | 20180101 | 20181231 |
+---------+--------+------------+----------+
It means that, for example, Kate will be available from 20180101 to 20181231. In this period of time she will work at station Swan from 20180101 to 20180131, at station Arrow from 20180301 to 20180331 and from 20180401 to 20181231.
My goal is to come to the following table
+------+---------------+-------------+
| | VacationStart | VacationEnd |
+------+---------------+-------------+
| Kate | 20180201 | 20180228 |
| Ben | 20170601 | 20171231 |
+------+---------------+-------------+
that means that Kate will be free from 20180201 to 20180228.
My first idea was to create a table with every day of the 2017 and 2018, let's say a CalTable
, then JOIN the table with People
to find every day that every person should be available. At this point JOIN again the resulting table with Shifts
to have evidence of the days NOT BETWEEN ShiftStart AND ShiftEnd
.
This steps give me correct results but are very slow, considering that I have almost 1.000.000 of person and usually between ContractStart
and ContractEnd
the are 10-20 years.
What could be a correct approach to get the results in a more clever and fast way?
Thanks. This is the data of the example on db<>Fiddle
For @ A_Name_Does_Not_Matter this is my attempt
CREATE TABLE #CalTable([ID] VARCHAR(8) NOT NULL)
DECLARE @num int
SET @num = 20170101
WHILE (@num <= 20181231)
BEGIN
INSERT INTO #CalTable([ID])
SELECT @num AS [ID]
SET @num = @num + 1
END
SELECT X.[Name], X.[TIMEID]
FROM (
-- All day availables
SELECT DISTINCT A.[Name],B.[ID] AS [TIMEID]
FROM #People A INNER JOIN #CalTable B
ON B.[ID] BETWEEN A.[ContractStart] AND A.[ContractEnd]
) X
LEFT JOIN (
-- Working day
SELECT DISTINCT A.[Name],B.[ID] AS [TIMEID]
FROM #People A INNER JOIN #CalTable B
ON B.[ID] BETWEEN A.[ContractStart] AND A.[ContractEnd]
INNER JOIN #Shifts C ON A.[Name]=C.[Name] AND B.[ID] BETWEEN C.[ShiftStart] AND C.[ShiftEnd]
) Z
ON X.[Name]=Z.[Name] AND X.[TIMEID]=Z.[TIMEID]
WHERE Z.[Name] IS NULL
ORDER BY X.[Name],X.[TIMEID]
and then aggregate the dates witk this query.