1

I have the following two tables

People

+--------+---------------+-------------+
|  Name  | ContractStart | ContractEnd |
+--------+---------------+-------------+
| Kate   |      20180101 |    20181231 |
| Sawyer |      20180101 |    20181231 |
| Ben    |      20170601 |    20181231 |
+--------+---------------+-------------+

Shifts

+---------+--------+------------+----------+
| Station |  Name  | ShiftStart | ShiftEnd |
+---------+--------+------------+----------+
| Swan    | Kate   |   20180101 | 20180131 |
| Arrow   | Kate   |   20180301 | 20180331 |
| Arrow   | Kate   |   20180401 | 20181231 |
| Flame   | Sawyer |   20180101 | 20181231 |
| Swan    | Ben    |   20180101 | 20181231 |
+---------+--------+------------+----------+

It means that, for example, Kate will be available from 20180101 to 20181231. In this period of time she will work at station Swan from 20180101 to 20180131, at station Arrow from 20180301 to 20180331 and from 20180401 to 20181231.

My goal is to come to the following table

+------+---------------+-------------+
|      | VacationStart | VacationEnd |
+------+---------------+-------------+
| Kate |      20180201 |    20180228 |
| Ben  |      20170601 |    20171231 |
+------+---------------+-------------+

that means that Kate will be free from 20180201 to 20180228.

My first idea was to create a table with every day of the 2017 and 2018, let's say a CalTable, then JOIN the table with People to find every day that every person should be available. At this point JOIN again the resulting table with Shifts to have evidence of the days NOT BETWEEN ShiftStart AND ShiftEnd. This steps give me correct results but are very slow, considering that I have almost 1.000.000 of person and usually between ContractStart and ContractEnd the are 10-20 years.

What could be a correct approach to get the results in a more clever and fast way?

Thanks. This is the data of the example on db<>Fiddle


For @ A_Name_Does_Not_Matter this is my attempt

CREATE TABLE #CalTable([ID] VARCHAR(8) NOT NULL)

DECLARE @num int
SET @num = 20170101

WHILE (@num <= 20181231)
BEGIN
    INSERT INTO #CalTable([ID])
    SELECT @num AS [ID]

    SET @num = @num + 1
END

SELECT X.[Name], X.[TIMEID]
FROM (
    -- All day availables
    SELECT DISTINCT A.[Name],B.[ID] AS [TIMEID]
    FROM #People A INNER JOIN #CalTable B
    ON B.[ID] BETWEEN A.[ContractStart] AND A.[ContractEnd]
) X
LEFT JOIN (
    -- Working day
    SELECT DISTINCT A.[Name],B.[ID] AS [TIMEID]
    FROM #People A INNER JOIN #CalTable B
    ON B.[ID] BETWEEN A.[ContractStart] AND A.[ContractEnd]
    INNER JOIN #Shifts C ON A.[Name]=C.[Name] AND B.[ID] BETWEEN C.[ShiftStart] AND C.[ShiftEnd]
) Z
ON X.[Name]=Z.[Name] AND X.[TIMEID]=Z.[TIMEID]
WHERE Z.[Name] IS NULL
ORDER BY X.[Name],X.[TIMEID]

and then aggregate the dates witk this query.

Nicolaesse
  • 2,554
  • 12
  • 46
  • 71

1 Answers1

1

so a persons start date could be the start of a vacation, and you can find the end of that vacation by finding the date of their first shift (minus 1 day) by using CROSS APPLY to get the TOP 1 shift, ORDERED BY DATE

In an unusual situation that they have no shifts, their vacation ends on their contract end date.

Future vacations then start the day after a shift, and end the day before the next shift (can be found by OUTER APPLY) and defaulted to contracted end date if there is no further shift

SELECT p.name, p.contractStart vacationstart, p.ContractEnd vacationend from people p WHERE not exists(select 1 from shifts s where p.name = s.name)
UNION
SELECT p2.name,
        p2.contractStart vacationstart, 
        dateadd(day,-1,DQ.ShiftStart) as vacationend 
            from PEOPLE P2
            CROSS APPLY 
                (SELECT TOP 1 s2.ShiftStart FROM shifts s2 WHERE p2.name = s2.name  order by sfiftstart) DQ
                WHERE DQ.ShiftStart > p2.contractstart

UNION
select P3.NAME,
       dateadd(day,1,s3.ShiftEnd) vacationstart,
       COALESCE(dateadd(day,-1,  DQ2.shiftStart),P3.ContractEnd)  --you might have to add handling yourself for  removing a case where they work on their contract end date
       FROM people p3 JOIN shifts s3 on p3.name = s3.name
       OUTER APPLY (SELECT TOP 1 s4.shiftStart 
                                    from shifts s4
                                        where s4.name = p3.name 
                                              and 
                                              s4.shiftstart > s3.shiftstart 
                                        order by s4.shiftstart) DQ2

it's hard for me to verify without test data. For an employee, what I seek is.

Contract Start, Shift1Start - 1

Shift1End + 1, Shift2Start - 1

Shift2End + 1, Shift3Start - 1

Shift3End + 1, ContractEnd

then add the case with 'no shifts' finally shifts may be contiguous, leading to vacations of duration of zero or less - you could filter these by making the query a sub query, and simply filtering

Cato
  • 3,652
  • 9
  • 12
  • Your query is one step to the solution, I have included it in a subquery and add WHERE vacationstart <= vacationend. Otherwise it shows correct results and others results like VacationStart=2019-01-01 and VacationEnd=2018-12-31. Thank you very much! – Nicolaesse Dec 13 '18 at 12:46