18

I have a problem ,i solve it but i have written a long procedure and i can't be sure that it covers all the possible cases .

The problem:

If i have a main interval time (From A to B), and secondary interval times (Many or no)

(`From X to Y AND From X` to Y` AND X`` to  Y`` AND ....`) 

I want to SUM all parts of My Main interval time (AB) out of secondary intervals in minutes in efficient and the least number of conditions (SQL server Procedure and C# method)?

For Example : If my Main interval From 02:00 to 10:30 And say one secondary interval From 04:00 to 08:00

Now i want this result : ((04:00 - 02:00) + (10:30 -08:00))* 60

Example with graph :

in the first case the result will be :

((X-A) + (B-Y)) * 60

and it will be more complicated when i have many secondary periods.

NOTE:

May be the overlap among the secondary intervals happening only when i have to compare the main period [A,B] to the UNION of at most two parallel sets of secondary intervals .the first set have to contain only one secondary interval and the the second set contains (many or no ) of secondary intervals .For example in the graph comparing [A,B] to (sets of 2,5)the first set (2) consists of one secondary interval and the second set (5) consists of three secondary intervals . and this 's the most worst case ,i need to handle.

For example :

IF my main interval is [15:00,19:40] and i have two sets of secondary intervals .according to my rule at least one of these sets should consists of one secondary interval. say the first set is [11:00 ,16:00] and the second set is consists of say two secondary intervals [10:00,15:00],[16:30,17:45] Now i want the result (16:30 -16:00) +(19:40 -17:45)


According to the comments :

My table is like this :

The first table contains secondary periods ,at most two sets of secondary periods in the same date for specific employee. the first set contains only one secondary period in the work day (W) [work_st,work_end],and this set will be empty if the day is weekend [E] and in this case no overlap among the secondary periods. and the second set may contain many secondary periods in the same date [check_in,check_out] ,because the employee may check_in_out many times in the same day.

emp_num  day_date   work_st    work_end   check_in   check_out     day_state

  547    2015-4-1   08:00       16:00     07:45      12:10           W
  547    2015-4-1   08:00       16:00     12:45      17:24           W
  547    2015-4-2   00:00       00:00     07:11      13:11           E

The second table contains the main period[A,B] and it's a one period for this employee at that day (one record)

emp_num  day_date   mission_in    mission_out
  547    2015-4-1    15:00          21:30
  547    2015-4-2    8:00           14:00

In the previous example if i have a procedure or method as required this procedure should take two parameters :

  • The Date
  • The emp_num

in the previous example it should be like this ('2015-4-1' ,547)

According to my explanation :

  • The Main period (Mission Period) [A,B] from the second table : Should be only one period in this date for that employee

    [15:00,21:30]

  • The secondary period for the passed date ('2015-4-1') for that employee was two sets of secondary periods (the worst case) from the first table

    The first set should contain only one secondary period (or zero periods) [08:00,16:00] the second set could contain many secondary periods (or zero periods)

    [07:45,12:10],[12:45,17:24]

The output should be [17:24,21:30] converted to minutes

Note

all day_date,mission_in,mission_out,work_st,work_end,check_in,check_out are datetime fields but i put just the time in the example for simplification , i want to ignore the date part except the day_date because it's the date which i calculate based on in addition to the emp_num.

enter image description here

Hossein Narimani Rad
  • 31,361
  • 18
  • 86
  • 116
Anyname Donotcare
  • 11,113
  • 66
  • 219
  • 392
  • Isn't this just the length of A-B minus the length of all intervals it contains, but not lower than 0? – Gert Arnold Apr 09 '15 at 21:45
  • @GertArnold : yes but some times parts of the secondary intervals is out the main period like case `2` so the result will be `(B-Y)` , or there are many secondary intervals or no secondary intervals at all – Anyname Donotcare Apr 09 '15 at 21:49
  • First reduce all sec. intervals to at least starting at A and at most ending at B. I think the sec. intervals don't overlap each other? – Gert Arnold Apr 09 '15 at 21:52
  • no ,may be the overlap happening when i have to compare the main period `[A,B]` to the `UNION` of at most `two parallel sets of secondary intervals` the first set have to contain `only one secondary interval` and the the second set contains (many or no ) of secondary intervals .For example in the graph comparing `[A,B]` to (sets of `2,5`)the first set `(2)` consists of one secondary interval and the second set `(5)` consists of many secondary intervals . and this 's the most worst case ,i need to handling – Anyname Donotcare Apr 09 '15 at 22:05
  • Can you show some simple sample data for multiple secondary intervals along with what the output should be? – Tanner Apr 09 '15 at 22:09
  • What is the source of the multiple intervals? Is it a single record with multiple columns, or multiple records each with their own start/end, or is the multiple intervals not part of the database at all? – Robert McKee Apr 09 '15 at 22:15
  • @Tanner: i edit my question with an example – Anyname Donotcare Apr 09 '15 at 22:24
  • @RobertMcKee : say one of the two sets of secondary intervals is the work period `[start_time,end_time]` and the second set which has many secondary intervals is the employee check ins_outs and the main period is a specific mission and i have all these intervals in the db. – Anyname Donotcare Apr 09 '15 at 22:27
  • 1
    I can understand the problem a bit better now, but it would help if you showed us how the data is stored... if it is currently stored? I think this can be solved, but anyone answering is going to be making assumptions of how you are storing the data. Some actual table schema with dummy data would be useful. – Tanner Apr 10 '15 at 08:48
  • Also are you storing time only or date and time? – Tanner Apr 10 '15 at 08:57
  • @Tanner: i store it as datetime but my case on the same date so i consider only the time part – Anyname Donotcare Apr 10 '15 at 09:47
  • @Tanner: i re_edit my question with table data example – Anyname Donotcare Apr 10 '15 at 10:15
  • I've also had to program this. I *highly* recommend you program it rather than putting to logic in a sproc. you can then unit test the various scenarios. if your problem is anything like mine then you ll have to worry about daylight savings, timezones, laws about whether to use the shift start or the clock intime etc etc – Ewan Apr 13 '15 at 16:22
  • @just_name, please add at least 5 more substantially different examples of your data to the one that you already included in the question. Add more rows to your tables. Include various combinations of how intervals may overlap. Most importantly, include the expected correct result for your sample data. This will greatly help to understand the logic of your task. Also, add a tag to the question with the version of SQL Server that you use. – Vladimir Baranov Apr 14 '15 at 11:09
  • @just_name. Your expected result is very important for a good answer. It is quite possible that calculating just one number (`SUM` of minutes) may be done more efficiently than generating a complete list of intervals which contribute to this `SUM`. – Vladimir Baranov Apr 14 '15 at 12:57
  • @VladimirBaranov: i tag my sql server version , the graph and the examples explain all possible states – Anyname Donotcare Apr 14 '15 at 18:29
  • @just_name, the tag with SQL Server version is very good, but the most important thing is to include what result you expect. I personally can't understand what logic you want to implement. I didn't downvote this question, but it is quite possible that those people who did had the same problem. You show 3 rows in `Table1` with `emp_num=547` and 2 rows in `Table2` with the same `emp_num=547`. So, what result should the query produce with this data? One row? Two rows? Three rows? The nice looking chart doesn't really help. – Vladimir Baranov Apr 14 '15 at 22:14
  • @VladimirBaranov : ok , i'll illustrate the idea more .the first table has 3 rows (two rows for specific date `1-4`, and the other for different date '2-4' ), the second table contains two rows for two different dates . my calcs should be performed per date for specific user . – Anyname Donotcare Apr 15 '15 at 09:23
  • @VladimirBaranov: i edit the question to clarify the example :) – Anyname Donotcare Apr 15 '15 at 09:55

5 Answers5

3

I had to solve this problem to digest some scheduling data. This allows multiple online times, but assumes that they do not overlap.

select convert(datetime,'1/1/2015 5:00 AM') StartDateTime,  convert(datetime,'1/1/2015 5:00 PM') EndDateTime, convert(varchar(20),'Online') IntervalType into #CapacityIntervals
insert into #CapacityIntervals select '1/1/2015 4:00 AM' StartDateTime,  '1/1/2015 6:00 AM' EndDateTime, 'Offline' IntervalType
insert into #CapacityIntervals select '1/1/2015 5:00 AM' StartDateTime,  '1/1/2015 6:00 AM' EndDateTime, 'Offline' IntervalType
insert into #CapacityIntervals select '1/1/2015 10:00 AM' StartDateTime,  '1/1/2015 12:00 PM' EndDateTime, 'Offline' IntervalType
insert into #CapacityIntervals select '1/1/2015 11:00 AM' StartDateTime,  '1/1/2015 1:00 PM' EndDateTime, 'Offline' IntervalType
insert into #CapacityIntervals select '1/1/2015 4:00 PM' StartDateTime,  '1/1/2015 6:00 PM' EndDateTime, 'Offline' IntervalType
insert into #CapacityIntervals select '1/1/2015 1:30 PM' StartDateTime,  '1/1/2015 2:00 PM' EndDateTime, 'Offline' IntervalType

    --Populate your Offline table
    select 
        ROW_NUMBER() over (Order by StartDateTime, EndDateTime) Rownum,
        StartDateTime, 
        EndDateTime
    into #Offline
    from #CapacityIntervals
    where IntervalType in ('Offline','Cleanout')
    group by StartDateTime, EndDateTime

    --Populate your Online table
    select 
        ROW_NUMBER() over (Order by StartDateTime, EndDateTime) Rownum,
        StartDateTime, 
        EndDateTime
    into #Online
    from #CapacityIntervals
    where IntervalType not in ('Offline','Cleanout')


    --If you have overlapping online intervals... check for those here and consolidate.


    -------------------------------
    --find overlaping offline times
    -------------------------------
    declare @Finished as tinyint
    set @Finished = 0

    while @Finished = 0
    Begin
        update #Offline
        set #Offline.EndDateTime = OverlapEndDates.EndDateTime
        from #Offline
        join
            (
            select #Offline.Rownum,
                MAX(Overlap.EndDateTime) EndDateTime
            from #Offline
            join #Offline Overlap
            on Overlap.StartDateTime between #Offline.StartDateTime and #Offline.EndDateTime
                and #Offline.Rownum <= Overlap.Rownum
            group by #Offline.Rownum
            ) OverlapEndDates
        on #Offline.Rownum = OverlapEndDates.Rownum

        --Remove Online times completely inside of online times
        delete #Offline
        from #Offline
        join #Offline Overlap
        on #Offline.StartDateTime between Overlap.StartDateTime and Overlap.EndDateTime
            and #Offline.EndDateTime between Overlap.StartDateTime and Overlap.EndDateTime
            and #Offline.Rownum > Overlap.Rownum

        --LOOK IF THERE ARE ANY MORE CHAINS LEFT    
        IF NOT EXISTS(
                select #Offline.Rownum,
                    MAX(Overlap.EndDateTime) EndDateTime
                from #Offline
                join #Offline Overlap
                on  Overlap.StartDateTime between #Offline.StartDateTime and #Offline.EndDateTime
                    and #Offline.Rownum < Overlap.Rownum
                group by #Offline.Rownum
                )
            SET @Finished = 1
    END

    -------------------------------
    --Modify Online times with offline ranges
    -------------------------------

    --delete any Online times completely inside offline range
    delete #Online 
    from #Online
    join #Offline
    on #Online.StartDateTime between #Offline.StartDateTime and #Offline.EndDateTime
        and #Online.EndDateTime between #Offline.StartDateTime and #Offline.EndDateTime

    --Find Online Times with offline range at the beginning
    update #Online
    set #Online.StartDateTime = #Offline.EndDateTime
    from #Online
    join #Offline
    on #Online.StartDateTime between #Offline.StartDateTime and #Offline.EndDateTime
        and #Online.EndDateTime >= #Offline.EndDateTime

    --Find Online Times with offline range at the end
    update #Online
    set #Online.EndDateTime = #Offline.StartDateTime
    from #Online
    join #Offline
    on #Online.StartDateTime <= #Offline.StartDateTime
        and #Online.EndDateTime between #Offline.StartDateTime and #Offline.EndDateTime

    --Find Online Times with offline range punched in the middle
    select #Online.Rownum, 
        #Offline.Rownum OfflineRow,
        #Offline.StartDateTime,
        #Offline.EndDateTime,
        ROW_NUMBER() over (Partition by #Online.Rownum order by #Offline.Rownum Desc) OfflineHoleNumber
    into #OfflineHoles
    from #Online
    join #Offline
    on #Offline.StartDateTime between #Online.StartDateTime and #Online.EndDateTime
        and #Offline.EndDateTime between #Online.StartDateTime and #Online.EndDateTime

    declare @HoleNumber as integer
    select @HoleNumber = isnull(MAX(OfflineHoleNumber),0) from #OfflineHoles

    --Punch the holes out of the online times
    While @HoleNumber > 0
    Begin
        insert into #Online 
        select
            -1 Rownum,
            #OfflineHoles.EndDateTime StartDateTime,
            #Online.EndDateTime EndDateTime
        from #Online
        join #OfflineHoles
        on #Online.Rownum = #OfflineHoles.Rownum
        where OfflineHoleNumber = @HoleNumber

        update #Online
        set #Online.EndDateTime = #OfflineHoles.StartDateTime
        from #Online
        join #OfflineHoles
        on #Online.Rownum = #OfflineHoles.Rownum
        where OfflineHoleNumber = @HoleNumber

        set @HoleNumber=@HoleNumber-1
    end

--Output total hours
select SUM(datediff(second,StartDateTime, EndDateTime)) / 3600.0 TotalHr
from #Online

--see how it split up the online intervals
select * 
from #Online
order by StartDateTime, EndDateTime
Brian Pressler
  • 6,653
  • 2
  • 19
  • 40
  • This code should fit exactly what you are doing. You can add the employee number as a key to all the joins. Replace #online with your first table and #offline with your second table. It should digest your entire data set at once. It works well... we've been using it for months without issue. – Brian Pressler Apr 10 '15 at 21:07
2

I've updated my answer with your data example and I'm adding another example for an employee 248 that uses case 2 and 5 from your graph.

--load example data for emply 547
select CONVERT(int, 547) emp_num, 
    Convert(datetime, '2015-4-1') day_date, 
    Convert(datetime, '2015-4-1 08:00') work_st,
    Convert(datetime, '2015-4-1 16:00') work_end, 
    Convert(datetime, '2015-4-1 07:45') check_in, 
    Convert(datetime, '2015-4-1 12:10') check_out, 
    'W' day_state
into #SecondaryIntervals
insert into #SecondaryIntervals select 547, '2015-4-1', '2015-4-1 08:00', '2015-4-1 16:00', '2015-4-1 12:45', '2015-4-1 17:24', 'W'
insert into #SecondaryIntervals select 547, '2015-4-2', '2015-4-2 00:00', '2015-4-2 00:00', '2015-4-2 07:11', '2015-4-2 13:11', 'E'

select CONVERT(int, 547) emp_num, 
    Convert(datetime, '2015-4-1') day_date, 
    Convert(datetime, '2015-4-1 15:00') mission_in,
    Convert(datetime, '2015-4-1 21:30') mission_out
into #MainIntervals
insert into #MainIntervals select 547, '2015-4-2', '2015-4-2 8:00', '2015-4-2 14:00'

--load more example data for an employee 548 with overlapping secondary intervals
insert into #SecondaryIntervals select 548, '2015-4-1', '2015-4-1 06:00', '2015-4-1 11:00', '2015-4-1 9:00', '2015-4-1 10:00', 'W'
insert into #SecondaryIntervals select 548, '2015-4-1', '2015-4-1 06:00', '2015-4-1 11:00', '2015-4-1 10:30', '2015-4-1 12:30', 'W'
insert into #SecondaryIntervals select 548, '2015-4-1', '2015-4-1 06:00', '2015-4-1 11:00', '2015-4-1 13:15', '2015-4-1 16:00', 'W'

insert into #MainIntervals select 548, '2015-4-1', '2015-4-1 8:00', '2015-4-1 14:00'

--Populate your Offline table with the intervals in #SecondaryIntervals
select 
    ROW_NUMBER() over (Order by emp_num, day_date, StartDateTime, EndDateTime) Rownum,
    emp_num,
    day_date,
    StartDateTime, 
    EndDateTime
into #Offline
from 
    (select emp_num,
        day_date,
        work_st StartDateTime, 
        work_end EndDateTime
    from #SecondaryIntervals
    where day_state = 'W'
    Group by emp_num,
        day_date,
        work_st, 
        work_end
    union
    select 
        emp_num,
        day_date,
        check_in StartDateTime, 
        check_out EndDateTime
    from #SecondaryIntervals
    Group by emp_num,
        day_date,
        check_in, 
        check_out
    ) SecondaryIntervals


--Populate your Online table
select 
    ROW_NUMBER() over (Order by emp_num, day_date, mission_in, mission_out) Rownum,
    emp_num,
    day_date,
    mission_in StartDateTime, 
    mission_out EndDateTime
into #Online
from #MainIntervals
group by emp_num,
    day_date,
    mission_in,
    mission_out


-------------------------------
--find overlaping offline times
-------------------------------
declare @Finished as tinyint
set @Finished = 0

while @Finished = 0
Begin
    update #Offline
    set #Offline.EndDateTime = OverlapEndDates.EndDateTime
    from #Offline
    join
        (
        select #Offline.Rownum,
            MAX(Overlap.EndDateTime) EndDateTime
        from #Offline
        join #Offline Overlap
        on #Offline.emp_num = Overlap.emp_num
            and #Offline.day_date = Overlap.day_date
            and Overlap.StartDateTime between #Offline.StartDateTime and #Offline.EndDateTime
            and #Offline.Rownum <= Overlap.Rownum
        group by #Offline.Rownum
        ) OverlapEndDates
    on #Offline.Rownum = OverlapEndDates.Rownum

    --Remove Online times completely inside of online times
    delete #Offline
    from #Offline
    join #Offline Overlap
    on #Offline.emp_num = Overlap.emp_num
        and #Offline.day_date = Overlap.day_date
        and #Offline.StartDateTime between Overlap.StartDateTime and Overlap.EndDateTime
        and #Offline.EndDateTime between Overlap.StartDateTime and Overlap.EndDateTime
        and #Offline.Rownum > Overlap.Rownum

    --LOOK IF THERE ARE ANY MORE CHAINS LEFT    
    IF NOT EXISTS(
            select #Offline.Rownum,
                MAX(Overlap.EndDateTime) EndDateTime
            from #Offline
            join #Offline Overlap
            on #Offline.emp_num = Overlap.emp_num
                and #Offline.day_date = Overlap.day_date
                and Overlap.StartDateTime between #Offline.StartDateTime and #Offline.EndDateTime
                and #Offline.Rownum < Overlap.Rownum
            group by #Offline.Rownum
            )
        SET @Finished = 1
END

-------------------------------
--Modify Online times with offline ranges
-------------------------------

--delete any Online times completely inside offline range
delete #Online 
from #Online
join #Offline
on #Online.emp_num = #Offline.emp_num
    and #Online.day_date = #Offline.day_date
    and #Online.StartDateTime between #Offline.StartDateTime and #Offline.EndDateTime
    and #Online.EndDateTime between #Offline.StartDateTime and #Offline.EndDateTime

--Find Online Times with offline range at the beginning
update #Online
set #Online.StartDateTime = #Offline.EndDateTime
from #Online
join #Offline
on #Online.emp_num = #Offline.emp_num
    and #Online.day_date = #Offline.day_date
    and #Online.StartDateTime between #Offline.StartDateTime and #Offline.EndDateTime
    and #Online.EndDateTime >= #Offline.EndDateTime

--Find Online Times with offline range at the end
update #Online
set #Online.EndDateTime = #Offline.StartDateTime
from #Online
join #Offline
on #Online.emp_num = #Offline.emp_num
    and #Online.day_date = #Offline.day_date
    and #Online.StartDateTime <= #Offline.StartDateTime
    and #Online.EndDateTime between #Offline.StartDateTime and #Offline.EndDateTime

--Find Online Times with offline range punched in the middle
select #Online.Rownum, 
    #Offline.Rownum OfflineRow,
    #Offline.StartDateTime,
    #Offline.EndDateTime,
    ROW_NUMBER() over (Partition by #Online.Rownum order by #Offline.Rownum Desc) OfflineHoleNumber
into #OfflineHoles
from #Online
join #Offline
on #Online.emp_num = #Offline.emp_num
    and #Online.day_date = #Offline.day_date
    and #Offline.StartDateTime between #Online.StartDateTime and #Online.EndDateTime
    and #Offline.EndDateTime between #Online.StartDateTime and #Online.EndDateTime

declare @HoleNumber as integer
select @HoleNumber = isnull(MAX(OfflineHoleNumber),0) from #OfflineHoles

--Punch the holes out of the online times
While @HoleNumber > 0
Begin
    insert into #Online 
    select
        -1 Rownum,
        #Online.emp_num,
        #Online.day_date,
        #OfflineHoles.EndDateTime StartDateTime,
        #Online.EndDateTime EndDateTime
    from #Online
    join #OfflineHoles
    on #Online.Rownum = #OfflineHoles.Rownum
    where OfflineHoleNumber = @HoleNumber

    update #Online
    set #Online.EndDateTime = #OfflineHoles.StartDateTime
    from #Online
    join #OfflineHoles
    on #Online.Rownum = #OfflineHoles.Rownum
    where OfflineHoleNumber = @HoleNumber

    set @HoleNumber=@HoleNumber-1
end

--Output total hours
select emp_num, day_date, 
    SUM(datediff(second,StartDateTime, EndDateTime)) / 3600.0 TotalHr, 
    SUM(datediff(second,StartDateTime, EndDateTime)) / 60.0 TotalMin
from #Online
group by emp_num, day_date
order by 1, 2

--see how it split up the online intervals
select emp_num, day_date, StartDateTime, EndDateTime
from #Online
order by 1, 2, 3, 4

Output is:

emp_num     day_date                TotalHr                                 TotalMin
----------- ----------------------- --------------------------------------- ---------------------------------------
547         2015-04-01 00:00:00.000 4.100000                                246.000000
547         2015-04-02 00:00:00.000 0.816666                                49.000000
548         2015-04-01 00:00:00.000 0.750000                                45.000000

(3 row(s) affected)

emp_num     day_date                StartDateTime           EndDateTime
----------- ----------------------- ----------------------- -----------------------
547         2015-04-01 00:00:00.000 2015-04-01 17:24:00.000 2015-04-01 21:30:00.000
547         2015-04-02 00:00:00.000 2015-04-02 13:11:00.000 2015-04-02 14:00:00.000
548         2015-04-01 00:00:00.000 2015-04-01 12:30:00.000 2015-04-01 13:15:00.000

(3 row(s) affected)

I left my other answer posted because it's more generic in case someone else wants to snag it. I see you added a bounty to this question. Let me know if there's something specific about my answer that doesn't satisfy you and I'll try to help you out. I process thousands of intervals with this method and it returns in just a few seconds.

Brian Pressler
  • 6,653
  • 2
  • 19
  • 40
2

My solution is quite similar to Vladimir Baranov.

Link to .NetFiddle

General idea

My algorithm is based on a modification of interval tree. It assumes that smallest unit of time is 1 minute (easy to modify).

Each tree node is in 1 of 3 state: unvisited, visited and used. The algorithm is based on recursive Search function that can be described by following steps:

  1. If node is used or searching interval is empty then return empty interval.
  2. If node is unvisited and nodes interval equals searching interval then mark current node as used and return node interval.
  3. Mark node as visited, split seraching interval and return sum of Search for left and right children.

Solution in steps

  1. Calculate biggest interval.
  2. Add to tree "secondary intervals".
  3. Add to tree "main interval".
  4. Calculate sum of intervals.

    Please note I assume that intervals are [start; end], i.e. both intervals are inclusive, what is easy to change.

Requirements

Assuming

n - number of "secondary intervals"

m - max time in base unit

Construction requires O(2n) storage space and work in O(n log n + m) time.

Here's my code

  public class Interval
    {
        public int Start { get; set; }

        public int End { get; set; }
    };
    enum Node
    {
        Unvisited = 0,
        Visited = 1,
        Used = 2
    };
    Node[] tree;

    public void Calculate()
    {
        var secondryIntervalsAsDates = new List<Tuple<DateTime,DateTime>> { new Tuple<DateTime, DateTime>( new DateTime(2015, 03, 15, 4, 0, 0), new DateTime(2015, 03, 15, 5, 0, 0))};
        var mainInvtervalAsDate = new Tuple<DateTime, DateTime>(new DateTime(2015, 03, 15, 3, 0, 0), new DateTime(2015, 03, 15, 7, 0, 0));
        // calculate biggest interval
        var startDate = secondryIntervalsAsDates.Union( new List<Tuple<DateTime,DateTime>>{mainInvtervalAsDate}).Min(s => s.Item1).AddMinutes(-1);
        var endDate = secondryIntervalsAsDates.Union(new List<Tuple<DateTime, DateTime>> { mainInvtervalAsDate }).Max(s => s.Item2);
        var mainInvterval = new Interval { Start = (int)(mainInvtervalAsDate.Item1 - startDate).TotalMinutes, End = (int)(mainInvtervalAsDate.Item2 - startDate).TotalMinutes };
        var wholeInterval = new Interval { Start = 1, End = (int)(endDate - startDate).TotalMinutes};
        //convert intervals to minutes
        var secondaryIntervals = secondryIntervalsAsDates.Select(s => new Interval { Start = (int)(s.Item1 - startDate).TotalMinutes, End = (int)(s.Item2 - startDate).TotalMinutes}).ToList();
        tree = new Node[wholeInterval.End * 2 + 1];
        //insert secondary intervals
        secondaryIntervals.ForEach(s => Search(wholeInterval, s, 1));
        //insert main interval
        var result = Search(wholeInterval, mainInvterval, 1);
        //calculate result
        var minutes = result.Sum(r => r.End - r.Start) + result.Count();
    }

    public IEnumerable<Interval> Search(Interval current, Interval searching, int index)
    {
        if (tree[index] == Node.Used || searching.End < searching.Start)
        {
            return new List<Interval>();
        }
        if (tree[index] == Node.Unvisited && current.Start == searching.Start && current.End == searching.End)
        {
            tree[index] = Node.Used;
            return new List<Interval> { current };
        }
        tree[index] = Node.Visited;
        return Search(new Interval { Start = current.Start, End = current.Start + (current.End - current.Start) / 2 },
                  new Interval { Start = searching.Start, End = Math.Min(searching.End, current.Start + (current.End - current.Start) / 2)  }, index * 2).Union(
            Search(new Interval { Start = current.Start + (current.End - current.Start) / 2 + 1 , End = current.End},
              new Interval { Start = Math.Max(searching.Start, current.Start + (current.End - current.Start) / 2 + 1), End = searching.End }, index * 2 + 1));
    }
Marcin J
  • 378
  • 4
  • 15
1

Here is SQLFiddle with complete query.

I will show how I built a query that returns number of minutes for each emp_num, day_date. If it turns out that no minutes are left for a particular emp_num, day_date, then result would not have a row with 0, there will be no such row at all.

General idea

I will use a table of numbers. We'll need only 24*60=1440 numbers, but it is a good idea to have such table in your database for other reports. I personally have it with 100,000 rows. Here is a very good article comparing different methods to generate such table.

For each interval I'm going to generate a set of rows using the table of numbers - one row for each minute in the interval. I assume that intervals are [start; end), i.e. start minute is inclusive, end minute is exclusive. For example, interval from 07:00 to 08:00 is 60 minutes, not 61.

Generate a table of numbers

DECLARE @Numbers TABLE (N int);
INSERT INTO @Numbers(N)
SELECT TOP(24*60)
    ROW_NUMBER() OVER(ORDER BY S.object_id) - 1 AS N
FROM
    sys.all_objects AS S
ORDER BY N
;

For this task it is better to have numbers that start from 0. Normally you would have it as a permanent table with primary key on N.

Sample data

DECLARE @Missions TABLE (emp_num int, day_date datetime, mission_in datetime, mission_out datetime);
DECLARE @Periods TABLE (emp_num int, day_date datetime, work_st datetime, work_end datetime, check_in datetime, check_out datetime, day_state char(1));

INSERT INTO @Missions (emp_num, day_date, mission_in, mission_out) VALUES
(547, '2015-04-01', '2015-04-01 15:00:00', '2015-04-01 21:30:00'),
(547, '2015-04-02', '2015-04-02 08:00:00', '2015-04-02 14:00:00');

INSERT INTO @Periods (emp_num, day_date, work_st, work_end, check_in, check_out, day_state) VALUES
(547, '2015-04-01', '2015-04-01 08:00:00', '2015-04-01 16:00:00', '2015-04-01 07:45:00', '2015-04-01 12:10:00', 'W'),
(547, '2015-04-01', '2015-04-01 08:00:00', '2015-04-01 16:00:00', '2015-04-01 12:45:00', '2015-04-01 17:24:00', 'W'),
(547, '2015-04-02', '2015-04-02 00:00:00', '2015-04-02 00:00:00', '2015-04-02 07:11:00', '2015-04-02 13:11:00', 'E');

My solution will not use the day_state column. I expect that you would have 00:00:00 for both work_st and work_end. Solution expects that the date component within the same row is the same and that day_date doesn't have time component.

If I designed the schema for this task I would have three tables instead of two: Missions, WorkPeriods and CheckPeriods. I would split your table Periods into two to avoid repeating work_st and work_end in several rows. But this solution would deal with your current schema and it will essentially generate this third table on the fly. In practice it means that performance may be improved.

Mission minutes

WITH
CTE_MissionMinutes
AS
(
    SELECT emp_num, day_date, N.N
    FROM
        @Missions AS M
        CROSS JOIN @Numbers AS N
    WHERE
        N.N >= DATEDIFF(minute, M.day_date, M.mission_in) AND
        N.N < DATEDIFF(minute, M.day_date, M.mission_out)
)

Each original row from @Missions turns into a set of rows, one for each minute of the interval (mission_in, mission_out).

Work periods

,CTE_WorkPeriods
AS
(
    SELECT P.emp_num, P.day_date, P.work_st, P.work_end
    FROM @Periods AS P
    GROUP BY P.emp_num, P.day_date, P.work_st, P.work_end
)

Generate a third helper table - one row for each emp_num, day_date, work_st, work_end - all intervals for (work_st, work_end).

Work and Check minutes

,CTE_WorkMinutes
AS
(
    SELECT emp_num, day_date, N.N
    FROM
        CTE_WorkPeriods
        CROSS JOIN @Numbers AS N
    WHERE
        N.N >= DATEDIFF(minute, CTE_WorkPeriods.day_date, CTE_WorkPeriods.work_st) AND
        N.N < DATEDIFF(minute, CTE_WorkPeriods.day_date, CTE_WorkPeriods.work_end)
)
,CTE_CheckMinutes
AS
(
    SELECT emp_num, day_date, N.N
    FROM
        @Periods AS P
        CROSS JOIN @Numbers AS N
    WHERE
        N.N >= DATEDIFF(minute, P.day_date, P.check_in) AND
        N.N < DATEDIFF(minute, P.day_date, P.check_out)
)

Exactly the same as for Missions.

Union "secondary intervals"

,CTE_UnionPeriodMinutes
AS
(
    SELECT emp_num, day_date, N
    FROM CTE_WorkMinutes

    UNION ALL -- can be not ALL here, but ALL is usually faster

    SELECT emp_num, day_date, N
    FROM CTE_CheckMinutes
)

Subtract secondary intervals from primary

,CTE_FinalMinutes
AS
(
    SELECT emp_num, day_date, N
    FROM CTE_MissionMinutes

    EXCEPT

    SELECT emp_num, day_date, N
    FROM CTE_UnionPeriodMinutes
)

Sum up number of minutes

SELECT
    emp_num
    ,day_date
    ,COUNT(*) AS FinalMinutes
FROM CTE_FinalMinutes
GROUP BY emp_num, day_date
ORDER BY emp_num, day_date;

To make the final query just put all CTEs together.

Result set

emp_num day_date                FinalMinutes
547     2015-04-01 00:00:00.000 246
547     2015-04-02 00:00:00.000 49

There are 246 minutes between 17:24 and 21:30.
There are  49 minutes between 13:11 and 14:00.

Here is SQLFiddle with complete query.

It is fairly easy to show actual intervals that lead to this SUM of minutes, but you said you need just the SUM.

Community
  • 1
  • 1
Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90
1

I found out probably the simplest solution.

.netFiddle

  1. Sort "Secondary intervals" by start date.
  2. Look for gaps in "secondary intervals" (simple iteration)
  3. Compare gaps with "main interval".

        //declare intervals
    var secondryIntervals = new List<Tuple<DateTime, DateTime>> {
            new Tuple<DateTime, DateTime>( new DateTime(2015, 03, 15, 4, 0, 0), new DateTime(2015, 03, 15, 5, 0, 0)),
            new Tuple<DateTime, DateTime>( new DateTime(2015, 03, 15, 4, 10, 0), new DateTime(2015, 03, 15, 4, 40, 0)),
            new Tuple<DateTime, DateTime>( new DateTime(2015, 03, 15, 4, 40, 0), new DateTime(2015, 03, 15, 5, 20, 0))};
    var mainInterval = new Tuple<DateTime, DateTime>(new DateTime(2015, 03, 15, 3, 0, 0), new DateTime(2015, 03, 15, 7, 0, 0));
    // add two empty intervals before and after main interval
    secondryIntervals.Add(new Tuple<DateTime, DateTime>(mainInterval.Item1.AddMinutes(-1), mainInterval.Item1.AddMinutes(-1)));
    secondryIntervals.Add(new Tuple<DateTime, DateTime>(mainInterval.Item2.AddMinutes(1), mainInterval.Item2.AddMinutes(1)));
    secondryIntervals = secondryIntervals.OrderBy(s => s.Item1).ToList();
    // endDate will rember 'biggest' end date
    var endDate = secondryIntervals.First().Item1;
    var result = secondryIntervals.Select(s =>
    {
        var temp = endDate;
        endDate = endDate < s.Item2 ? s.Item2 : endDate;
        if (s.Item1 > temp)
        {
            return new Tuple<DateTime, DateTime>(temp < mainInterval.Item1 ? mainInterval.Item1 : temp,
                                                 mainInterval.Item2 < s.Item1 ? mainInterval.Item2 : s.Item1);
        }
        return null;
    })
        // remove empty records
                    .Where(s => s != null && s.Item2 > s.Item1).ToList();
    var minutes = result.Sum(s => (s.Item2 - s.Item1).TotalMinutes);
    

The algorithm requires O(n log n) time (for sorting) without additional storage and assumptions.

Marcin J
  • 378
  • 4
  • 15