-1

I have a student table and I would like to know How long their session/training has lasted. I would like to exclude weekends, but I want to count continuous days excluding weekends. A class has Start Date and End Date, for example, student ID S1 a can book a class on Jan then again on Feb and I would like to know how many days for Jan bookings and February while excluding weekends. Basically, I am looking for continuous dates from Start Date to End date by student id where there no break except for the weekend.

SELECT 
 [ID]
,[StartDate]
,[EndDate]
,[BookingDays] AS Consecutive_Booking
FROM StudentBooking

If the student (student classifications(Type)) has book the class for 5 days or 2 times (Start Date to End Date (Monday to Friday)) in the last 3 months they are Resident else Visitors. Start Date and End date is recorded as of Monday - Friday only. Please note student ID 1 has a continuous date and this should be counted as a block. (02/01/2018-12/01/2018) second block 22/01-26/01

I would like to replicate below table.

ID   StartDate  EndDate     Duration     Type
1   02/01/2018  05/01/2018              ==>Please Note have continous dates
1   08/01/2018  12/01/2018   9           Resident
1   22/01/2018  26/01/2018   5           Resident 
2   23/01/2018  26/01/2018   4           Visitor
3   29/01/2018  31/01/2018   3           Visitor
SQLfun
  • 53
  • 3
  • 15
  • Please read [this](http://spaghettidba.com/2015/04/24/how-to-post-a-t-sql-question-on-a-public-forum/) for some tips on improving your question. It's helpful to tag database questions with both the appropriate software (MySQL, Oracle, DB2, ...) and version, e.g. `sql-server-2014`. Differences in syntax and features often affect the answers. Note that `tsql` narrows the choices, but does not specify the database. – HABO Jul 07 '18 at 21:12
  • Possibly useful hint: `( @@DateFirst + DatePart( weekday, SampleDate ) - 1 ) % 7 + 1` will always return an integer from `1` to `7` with `1` corresponding to Sunday regardless of the setting of `DateFirst` or `Language`. That and a _numbers table_ provides one approach. – HABO Jul 07 '18 at 21:16
  • How to handle separate bookings for the same student? Should they somehow be considered one if there's just a weekend in betreen? Or do you just want to know the number of days excluding weekends for every row (booking)? – Wolfgang Kais Jul 07 '18 at 22:22
  • @SQLfun So one part of the problem ist to identify continuous sessions, right? Wouldn't it be a lot easier to have a kind of "SessionID" in the table? Also: isn't there a need to differentiate between courses? Do you mind include some sample data and a "desired result"? – Wolfgang Kais Jul 07 '18 at 23:01
  • @SQLfun I see that you included sample data… what about the gap between the first end date (12/01/2018) and the second start date (25/01/2018)? This isn't just a weekend. How would the desired result look like? Should the "type" be determined by the solution or is it already stored in the table? – Wolfgang Kais Jul 07 '18 at 23:14
  • @SQLfun Sorry to say that, but I still can't see the continuous block (02/01/2018-28/01/2018 and 24/03-25/03). The first block in your table I see is 9 days long. – Wolfgang Kais Jul 07 '18 at 23:27
  • @ WolfgangK 1 yeah rectified now – SQLfun Jul 07 '18 at 23:40
  • @WolfgangK where are you filtering the last 3 months they are Resident else Visitors.(I want to test it with like last 6 months one year, and all the records) – SQLfun Jul 08 '18 at 11:48

1 Answers1

0

Here's my solution to your problem.

In the CTE "comparison", I join every record with this and all following records for that student. This way, I have a possible starting point (from the left side of the join) of a continuous training block and a possible end of such a block (from the right side of the join). Using "cross applies", I calculate 2 values:

  1. the workdays from the start of the possible first to the end of the possible last interval of a chain
  2. the workdays in just the possible last interval in the chain.

On the latter values, using a windows function, I build a running total of the workdays from the possible start and end intervals. You tagged the question with "SQL 2012", so using this window function should be possible.

In the next CTE ("sorting"), I restict the previous results to those where the running total equals the workdays between first start date and last end date. This way, only continuous blocks are left. These are then numbered in 2 ways:

  1. continuous blocks with the same EndDate are numberd by ascending StartDate
  2. continuous blocks with the same StartDate are numberd by descending EndDate.

For every EndDate, I want the earliest StartDate, and for this StartDate, I want the latest EndDate only, so I filter for 1 in both numberings. Here it is:

WITH
  comparison (ID, StartDate, EndDate, TotalDays, SumSingleDays) AS (
    SELECT bStart.ID, bStart.StartDate, bEnd.EndDate, Workdays.Total
      , SUM(Workdays.Single) OVER (
          PARTITION BY bStart.ID, bStart.StartDate 
          ORDER BY bEnd.StartDate
          ROWS UNBOUNDED PRECEDING)
    FROM StudentBookings bStart
      INNER JOIN StudentBookings bEnd 
        ON bStart.ID = bEnd.ID AND bStart.StartDate <= bEnd.StartDate
      CROSS APPLY (VALUES (
        DATEDIFF(day, 0, bStart.StartDate), 
        DATEDIFF(day, 0, bEnd.StartDate), 
        1+DATEDIFF(day, 0, bEnd.EndDate))
      ) d (s1, s2, e2)
      CROSS APPLY (VALUES (
        (d.e2 - d.s1) - (d.e2/7 - d.s1/7) - ((d.e2+1)/7 - (d.s1+1)/7),
        (d.e2 - d.s2) - (d.e2/7 - d.s2/7) - ((d.e2+1)/7 - (d.s2+1)/7))
      ) Workdays (Total, Single)
  ),
  sorting (ID, StartDate, EndDate, Duration, RowNumStart, RowNumEnd) AS (
    SELECT ID, StartDate, EndDate, TotalDays
      , ROW_NUMBER() OVER (PARTITION BY ID, EndDate ORDER BY StartDate)
      , ROW_NUMBER() OVER (PARTITION BY ID, StartDate ORDER BY EndDate DESC)
    FROM comparison
    WHERE TotalDays = SumSingleDays
  )
SELECT ID, StartDate, EndDate, Duration
  , CASE WHEN Duration >= 5 THEN 'Resident' ELSE 'Visitor' END AS [Type]
FROM sorting 
WHERE (RowNumStart = 1) 
  AND (RowNumEnd = 1)
ORDER BY ID, StartDate;

The results:

enter image description here

Maybe there's a more elegant way to solve this using the interval packing solution by Itzik Ben-Gan, I'll post it when I figured that out.

Added:

Additionally, I count the number of bookings of all booking blocks and build a sum by student (ID) to make the "Resident" decision in the end. Bookings are restricted to the last 3 months in the first CTE (comparison):

WITH
  comparison (ID, StartDate, EndDate, TotalDays, CountBookings, SumSingleDays) AS (
    SELECT bStart.ID, bStart.StartDate, bEnd.EndDate, Workdays.Total
      , COUNT(Workdays.Single) OVER (
          PARTITION BY bStart.ID, bStart.StartDate 
          ORDER BY bEnd.StartDate
          ROWS UNBOUNDED PRECEDING)
      , SUM(Workdays.Single) OVER (
          PARTITION BY bStart.ID, bStart.StartDate 
          ORDER BY bEnd.StartDate
          ROWS UNBOUNDED PRECEDING)
    FROM StudentBookings bStart
      INNER JOIN StudentBookings bEnd 
        ON bStart.ID = bEnd.ID AND bStart.StartDate <= bEnd.StartDate
      CROSS APPLY (VALUES (
        DATEDIFF(day, 0, bStart.StartDate), 
        DATEDIFF(day, 0, bEnd.StartDate), 
        1+DATEDIFF(day, 0, bEnd.EndDate))
      ) d (s1, s2, e2)
      CROSS APPLY (VALUES (
        (d.e2 - d.s1) - (d.e2/7 - d.s1/7) - ((d.e2+1)/7 - (d.s1+1)/7),
        (d.e2 - d.s2) - (d.e2/7 - d.s2/7) - ((d.e2+1)/7 - (d.s2+1)/7))
      ) Workdays (Total, Single)
    WHERE bStart.StartDate >= DATEADD(month, -3, GETDATE())
  ),
  sorting (ID, StartDate, EndDate, Duration, CountBookings, RowNumStart, RowNumEnd) AS (
    SELECT ID, StartDate, EndDate, TotalDays, CountBookings
      , ROW_NUMBER() OVER (PARTITION BY ID, EndDate ORDER BY StartDate)
      , ROW_NUMBER() OVER (PARTITION BY ID, StartDate ORDER BY EndDate DESC)
    FROM comparison
    WHERE TotalDays = SumSingleDays
  ),
 counting (ID, StartDate, EndDate, Duration, Bookings) AS (
  SELECT ID, StartDate, EndDate, Duration
    , SUM(CountBookings) OVER (PARTITION BY ID)
  FROM sorting WHERE (RowNumStart = 1) AND (RowNumEnd = 1)
)
SELECT ID, StartDate, EndDate, Duration, Bookings
  , CASE 
      WHEN Duration >= 5 OR Bookings >= 2 THEN 'Resident' ELSE 'Visitor'
    END AS [Type]
FROM counting
ORDER BY ID, StartDate;

Filtering ClasseReferences:

The ClassReference will be taken and filtered from the bStart table reference. To be able to add this field to the final query, it also has to be used to join the bEndtable reference, so only booking intervals with the same ClassReference value will be connected to blocks:

WITH
  comparison (ID, ClassReference, StartDate, EndDate, TotalDays, CountBookings, SumSingleDays) AS (
    SELECT bStart.ID, bStart.ClassReference, bStart.StartDate, bEnd.EndDate, Workdays.Total
      , COUNT(Workdays.Single) OVER (
          PARTITION BY bStart.ID, bStart.StartDate 
          ORDER BY bEnd.StartDate
          ROWS UNBOUNDED PRECEDING)
      , SUM(Workdays.Single) OVER (
          PARTITION BY bStart.ID, bStart.StartDate 
          ORDER BY bEnd.StartDate
          ROWS UNBOUNDED PRECEDING)
    FROM StudentBookings bStart
      INNER JOIN StudentBookings bEnd 
        ON bStart.ID = bEnd.ID AND bStart.StartDate <= bEnd.StartDate
       AND bStart.ClassReference = bEnd.ClassReference
      CROSS APPLY (VALUES (
        DATEDIFF(day, 0, bStart.StartDate), 
        DATEDIFF(day, 0, bEnd.StartDate), 
        1+DATEDIFF(day, 0, bEnd.EndDate))
      ) d (s1, s2, e2)
      CROSS APPLY (VALUES (
        (d.e2 - d.s1) - (d.e2/7 - d.s1/7) - ((d.e2+1)/7 - (d.s1+1)/7),
        (d.e2 - d.s2) - (d.e2/7 - d.s2/7) - ((d.e2+1)/7 - (d.s2+1)/7))
      ) Workdays (Total, Single)
    WHERE bStart.StartDate >= DATEADD(month, -3, GETDATE())
      AND bStart.ClassReference IN (N'C1', N'C2')
  ),
  sorting (ID, ClassReference, StartDate, EndDate, Duration, CountBookings, RowNumStart, RowNumEnd) AS (
    SELECT ID, ClassReference, StartDate, EndDate, TotalDays, CountBookings
      , ROW_NUMBER() OVER (PARTITION BY ID, ClassReference, EndDate ORDER BY StartDate)
      , ROW_NUMBER() OVER (PARTITION BY ID, ClassReference, StartDate ORDER BY EndDate DESC)
    FROM comparison
    WHERE TotalDays = SumSingleDays
  ),
  counting (ID, ClassReference, StartDate, EndDate, Duration, Bookings) AS (
    SELECT ID, ClassReference, StartDate, EndDate, Duration
      , SUM(CountBookings) OVER (PARTITION BY ID, ClassReference)
    FROM sorting WHERE (RowNumStart = 1) AND (RowNumEnd = 1)
  )
SELECT ID, ClassReference, StartDate, EndDate, Duration, Bookings
  , CASE 
      WHEN Duration >= 5 OR Bookings >= 2 THEN 'Resident' ELSE 'Visitor'
    END AS [Type]
FROM counting
ORDER BY ID, StartDate;

Using this data for testing:

Test Data

With a filter for the last 12 months, the query returns:

Result with ClassReference

So Student 1 is "Resident" in class C2 but Visitor in Class C1.

Wolfgang Kais
  • 4,010
  • 2
  • 10
  • 17
  • Wow that really amazing, let me check it out, I had a quick read about Packing Intervals by Itzik Ben-Gan | Apr 13, 2011, and really great, more stuff to learn...Many thanks and I look forward for the second solution. – SQLfun Jul 08 '18 at 10:31
  • Great tested it and is working; where are you filtering the last 3 months they are Resident else Visitors. (I want to test it with other duration/window/time flame from StartDate i.e last 6 months one year, and all the records) – SQLfun Jul 08 '18 at 11:50
  • @SQLfun In fact I forgot that "2 times in the last 3 months" requirement. If I get you right, then if there are 2 bookings in the last 3 months, their durations do not matter? Also: How to determine that a booking belongs to the last 3 months when StartDate and EndDate do not both belong to the 3 months interval? And: When do these 3 months start and end (start 3 month ago from today and end today or start/end on a first/last day of a month)? – Wolfgang Kais Jul 08 '18 at 15:39
  • Many thanks for this: Rule #1 : If they booked class for five days continuously they are Residents else Visitor: Rule #2 If they have booked the class for three times (looking at the Start to End date) they are also Residents else Visitor. If they meet either of these criteria they are Resident else Visitor. – SQLfun Jul 09 '18 at 14:33
  • @SQLfun Thanks, but: The "3 months" restriction still isn't clear to me. Are these "whole" (calendar) months? Relatively to the current date (GETDATE()) or to a booking? And how must StartDate and EndDate fit into these 3 months (only one of them or both)? In your original post you were talking about 2 times... now 3? – Wolfgang Kais Jul 09 '18 at 15:08
  • Great question it will be based on the last date of booking from (GETDATE() – SQLfun Jul 10 '18 at 20:01
  • @SQLfun Sorry, still don't get it... between the last booking day and `GETDATE()` there will always only be that last booking. Please see my edited answer and let me know if that's what you want before I go for the packed interval solution. – Wolfgang Kais Jul 10 '18 at 22:38
  • @ WolfgangK Wow seems to work, still conducting more tests but please continue with Interval solution. if we change anything will be very minor. Amazing and many thanks, I will keep updating you with my tests.but so far good – SQLfun Jul 11 '18 at 22:19
  • @ WolfgangK it looks like we may have differing datatypes in an expression and SQL Server casts them automatically according to the rules of datatype precedence. any workaround/suggestions.Many thanks – SQLfun Jul 13 '18 at 22:00
  • @ WolfgangK Wow wow really great conducted more tests today; created a small table with one student and tested the different date range and worked great, final tests tomorrow. I have used the last solution you have provided.which was edited yesterday..Will update you tomorrow after more tests.Many thanks. – SQLfun Jul 14 '18 at 19:38
  • @SQLfun Suppose a student has 3 consecutive weeks of booking (5 days each), one with class_reference C1, one with C2 and one with C3. After creating a single block of 15 days from these 3 weeks, just one of the 3 class_references can be displayed in the desired extra field. Which one? – Wolfgang Kais Jul 19 '18 at 12:21
  • @ WolfgangK Thanks so much for your help. Great question, I think the best way-way is FIRST to filter this class_reference fields we don't need BEFORE calculations, then perform the calculations. we filter class_reference C2,C3 but I want to have the ability to filter more classes without affecting the final answer or affecting the Query technical structure. Basically, I can add filter more class_reference out/the one I don't need. – SQLfun Jul 20 '18 at 21:29
  • @SQLfun The place to FIRST filter ClassReferences is the WHERE clause of the cte *comparison* (`WHERE bStart.StartDate >= DATEADD(month, -3, GETDATE()) `). You will want to add the criterion to the `bStart` table reference. I added the changed query to my answer. – Wolfgang Kais Jul 20 '18 at 22:11
  • @ WolfgangK Wow that is great, I thought ID1 will have a continuous date StartDate 02/01/2018-05/01/2018 ----StartDate 08/01/2018-01-12 | In theory, this a continuous date from 02/01/2018-12/01/2018 (ID1 has one block of continuous dates)==Total 9 days which is Resident, I think is just something very minor need updating. Wow almost there , great progress, please check Duration if for ID1 is counting okay. – SQLfun Jul 21 '18 at 18:44
  • @ WolfgangK , Great I have used the previous Query and updated bStart.ClassReference IN (N'C1', N'C2') and work great. Final Question what about if I want to discount days these dates which attendance during National holiday i.e. Christmas day and new year. I have table with USA holiday (Holiday name and Holiday date) and I want to discount them when calculating the duration. StartDate of studentID 1 attendance classes from 01/01/2018-02/01/2018 this should show only 1 day as because of new year’s days. And our case statement the student will be a Visitor. Where to connect Holiday tbl – SQLfun Jul 22 '18 at 15:28
  • @SQLfun Including the filter on ClassReference in the previous solution will neither show any class reference in the result nor ignore the ClassReference when calculating the "resident status" of a student. Just imagine that there where 2 conscutive bookings with C3 and just one additional day connected to them with C1 or C2. – Wolfgang Kais Jul 22 '18 at 18:00
  • @SQLfun Excluding the holidays isn't just a small change, and I think that SO isn't the place for agile development. Where we arrived now is far bejond the original requirements that people expect to be solved when they search this site. I suggest to post a new question containing ALL requirements, representative sample data and the desired result with an explanation how to get there manually (or amend the original post accordingly). – Wolfgang Kais Jul 22 '18 at 18:01
  • @ WolfgangK Really great, totally agree, I was going to do that but I thought let check with you before I post, I will post the code also as and then summarize the results required. definitely, this now resolved, now is just a matter of going to another level.Great help. – SQLfun Jul 23 '18 at 10:42