5

T-SQL, SQL Server 2008 and up

Given a sample table of

 StatusSetDateTime   | UserID | Status    | StatusEndDateTime   | StatusDuration(in seconds)
============================================================================
 2012-01-01 12:00:00 | myID   | Available | 2012-01-01 13:00:00 | 3600

I need to break that down into a view that uses 15 minute intervals for example:

IntervalStart       | UserID | Status | Duration

===========================================

2012-01-01 12:00:00 | myID | Available | 900 

2012-01-01 12:15:00 | myID | Available | 900

2012-01-01 12:30:00 | myID | Available | 900 

2012-01-01 12:45:00 | myID | Available | 900 

2012-01-01 13:00:00 | myID | Available | 0

etc....

Now I've been able to search around and find some queries that will break down I found something similar for MySql Here :

And something for T-SQL Here

But on the second example they are summing the results whereas I need to divide the total duration by the interval time (900 seconds) by user by status.

I was able to adapt the examples in the second link to split everything into intervals but the total duration time is returned and I cannot quite figure out how to get the Interval durations to split (and still sum up to the total original duration).

Thanks in advance for any insight!

edit : First Attempt

 ;with cte as 
    (select MIN(StatusDateTime) as MinDate
          , MAX(StatusDateTime) as MaxDate
          , convert(varchar(14),StatusDateTime, 120) as StartDate
          , DATEPART(minute, StatusDateTime) /15 as GroupID
          , UserID
          , StatusKey
          , avg(StateDuration) as AvgAmount
     from AgentActivityLog
     group by convert(varchar(14),StatusDateTime, 120)
         , DATEPART(minute, StatusDateTime) /15
         , Userid,StatusKey)

  select dateadd(minute, 15*GroupID, CONVERT(datetime,StartDate+'00'))
         as [Start Date]
       , UserID, StatusKey, AvgAmount as [Average Amount]
  from cte

edit : Second Attempt

;With cte As
   (Select DateAdd(minute
                   , 15 * (DateDiff(minute, '20000101', StatusDateTime) / 15)
                   , '20000101') As StatusDateTime
         , userid, statuskey, StateDuration
    From AgentActivityLog)

 Select StatusDateTime, userid,statuskey,Avg(StateDuration)
 From cte
 Group By StatusDateTime,userid,statuskey;
yassadi
  • 524
  • 1
  • 9
  • 20
Wjdavis5
  • 3,952
  • 7
  • 35
  • 63
  • In your example, you have splitted into 5 rows and 5 * 900 != 3600 or am I missing something? – Faruk Sahin Nov 30 '12 at 16:05
  • 1
    Does your first interval always begin at the time the StatusSetDateTime, or is it always based on :00, :15, :30, :45? Using your example, if StatusSetDateTime came in as 12:06:30, would your first interval start at 12;00:00 or 12:06:30? Also, can you post the SQL you have so far for splitting? – Data Masseur Nov 30 '12 at 16:11
  • @FarukSahin Sorry I meant to put 0 in the final interval thank for catching that. The intervals will always be 00, 15, 30, 45 – Wjdavis5 Nov 30 '12 at 16:13
  • Personally, `StatusEndDateTime` (really?) is _exclusive_ - the status you have at `13:00` is **not** `Available`, it's something else (unknown, or potentially nothing). It would, after all, suggest that you're available for an additional 15 minutes, which may not actually be the case. – Clockwork-Muse Nov 30 '12 at 17:17
  • Sorry if I'm repeating, it's not clear: will the StatusSetDatetime *always* end in :00, :15, :30, or :45? Without any seconds portion? And, are you SURE about including a final row with 0 duration? What value or business meaning does a 0 duration row actually have? – ErikE Nov 30 '12 at 19:54
  • @ErikE Sorry for the confusion - StatusSetDateTime does not always end on the 15 min interval. But we need to always round each StatusSetDateTime Down to the interval. I"m not sure what you mean by 0 duration rows though. – Wjdavis5 Dec 03 '12 at 14:33
  • This row in your question: `2012-01-01 13:00:00 | myID | Available | 0` has a Duration column value of 0. Do you really need those? They make coding everything harder and, being 0, have no meaning. Also, does "round each StatusSetDateTime down" mean treat it as if it WERE starting at the interval? Just quantize the input dates to the nearest 15 minutes, before doing any calculation at all? Note that this can prevent your sums from matching. And what if two events start and stop in the same 15 minutes? – ErikE Dec 03 '12 at 18:59

5 Answers5

4
;with cte_max as 
(
   select dateadd(mi, -15, max(StatusEndDateTime)) as EndTime, min(StatusSetDateTime) as StartTime
   from AgentActivityLog
), times as
(
    select StartTime as Time from cte_max
    union all
    select dateadd(mi, 15, c.Time)
    from times as c
        cross join cte_max as cm
    where c.Time <= cm.EndTime
)
select
    t.Time, A.UserID, A.Status,
    case
        when t.Time = A.StatusEndDateTime then 0
        else A.StatusDuration / (count(*) over (partition by A.StatusSetDateTime, A.UserID, A.Status) - 1)
    end as Duration
from AgentActivityLog as A
    left outer join times as t on t.Time >= A.StatusSetDateTime and t.Time <= A.StatusEndDateTime

sql fiddle demo

Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
  • Thanks for the help - I receive a maxrecursion error when running this in SSMS, seems to work ok in SQL Fiddle. Only problem there is that for myID2 - unavailable, you have a duration of 320 even though they were in that state for the entire 15 minutes, which should be 900 – Wjdavis5 Nov 30 '12 at 17:54
  • yes, ok, it's because I've added a wrond duration - 1600 when it should be 4500. It seems that duration field is redundant at all, you can write 900 instead of `A.StatusDuration / (count(*) over (partition by A.StatusSetDateTime, A.UserID, A.Status) - 1)` and remove duration from your table :) http://sqlfiddle.com/#!3/8c921/1 – Roman Pekar Nov 30 '12 at 18:18
  • @Wjdavis5 I also have super simple solution for you if difference between start and end date always 1 hour – Roman Pekar Nov 30 '12 at 18:19
  • THanks again for working on this, My durations are not always 1 hour that was just an easy example. – Wjdavis5 Nov 30 '12 at 18:52
3

I've never been comfortable with using date math to split things up into partitions. It seems like there are all kinds of pitfalls to fall into.

What I prefer to do is to create a table (pre-defined, table-valued function, table variable) where there's one row for each date partition range. The table-valued function approach is particularly useful because you can build it for arbitrary ranges and partition sizes as you need. Then, you can join to this table to split things out.

paritionid starttime     endtime
---------- ------------- -------------
1          8/1/2012 5:00 8/1/2012 5:15
2          8/1/2012 5:15 8/1/2012 5:30
...

I can't speak to the performance of this method, but I find the queries are much more intuitive.

ErikE
  • 48,881
  • 23
  • 151
  • 196
quillbreaker
  • 6,119
  • 3
  • 29
  • 47
  • P.S. You don't have to say "pitfalls to fall into". "pitfalls" is enough by itself! What else does one do with a pitfall but, well, fall into it? :) My grandmother used to say "Are you going with?" instead of "Are you going?", which is a similar thing. – ErikE Nov 30 '12 at 20:02
  • Whenever someone edits a work draft of mine, they cut 5% to 10% of the words. If I believed in past lives, I would believe that in a past life, I was paid by the word. Forsooth. – quillbreaker Nov 30 '12 at 23:49
  • I hope you understood I was enjoying the language thingy, not trying to radically criticize. :) – ErikE Nov 30 '12 at 23:54
1

It is relatively simple if you have a helper table with every 15-minute timestamp, which you JOIN to your base table via BETWEEN. You can build the helper table on the fly or keep it permanently in your database. Simple for the next guy at your company to figure out too:

// declare a table and a timestamp variable
declare @timetbl table(t datetime)
declare @t datetime

// set the first timestamp
set @t = '2012-01-01 00:00:00'

// set the last timestamp, can easily be extended to cover many years
while @t <= '2013-01-01'
begin
    // populate the table with a new row, every 15 minutes
    insert into @timetbl values (@t)
    set @t = dateadd(mi, 15, @t)
end


// now the Select query:
select 
   tt.t, aal.UserID, aal.Status,
   case when aal.StatusEndDateTime <= tt.t then 0 else 900 end as Duration
   // using a shortcut for Duration, based on your comment that Start/End are always on the quarter-hour, and thus always 900 seconds or zero

from 
   @timetbl tt 
      INNER JOIN AgentActivityLog aal 
         on tt.t between aal.StatusSetDateTime and aal.StatusEndDateTime

order by
  aal.UserID, tt.t
ExactaBox
  • 3,235
  • 16
  • 27
0

You can use a recursive Common Table Expression, where you keep adding your duration while the StatusEndDateTime is greater than the IntervalStart e.g.

;with cte as (
    select StatusSetDateTime as IntervalStart
        ,UserID
        ,Status
        ,StatusDuration/(datediff(mi, StatusSetDateTime, StatusEndDateTime)/15) as Duration
        , StatusEndDateTime
    From AgentActivityLog
    Union all
    Select DATEADD(ss, Duration, IntervalStart) as IntervalStart
        , UserID
        , Status
        , case when DATEADD(ss, Duration, IntervalStart) = StatusEndDateTime then 0 else Duration end as Duration
        , StatusEndDateTime
    From cte
    Where IntervalStart < StatusEndDateTime
)

select IntervalStart, UserID, Status, Duration from cte
Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
Matti John
  • 19,329
  • 7
  • 41
  • 39
0

Here's a query that will do the job for you without requiring helper tables. (I have nothing against helper tables, they are useful and I use them. It is also possible to not use them sometimes.) This query allows for activities to start and end at any times, even if not whole minutes ending in :00, :15, :30, :45. If there will be millisecond portions then you'll have to do some experimenting because, following your model, I only went to second resolution.

If you have a known hard maximum duration, then remove @MaxDuration and replace it with that value, in minutes. N <= @MaxDuration is crucial to the query performing well.

DECLARE @MaxDuration int;
SET @MaxDuration = (SELECT Max(StatusDuration) / 60 FROM #AgentActivityLog);

WITH
L0 AS(SELECT 1 c UNION ALL SELECT 1),
L1 AS(SELECT 1 c FROM L0, L0 B),
L2 AS(SELECT 1 c FROM L1, L1 B),
L3 AS(SELECT 1 c FROM L2, L2 B),
L4 AS(SELECT 1 c FROM L3, L3 B),
L5 AS(SELECT 1 c FROM L4, L4 B),
Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) n FROM L5)
SELECT
   S.IntervalStart,
   Duration = DateDiff(second, S.IntervalStart, E.IntervalEnd)
FROM
   #AgentActivityLog L
   CROSS APPLY (
      SELECT N, Offset = (N.N - 1) * 900
      FROM Nums N
      WHERE N <= @MaxDuration
   ) N
   CROSS APPLY (
      SELECT Edge =
         DateAdd(second, N.Offset, DateAdd(minute,
            DateDiff(minute, '20000101', L.StatusSetDateTime)
            / 15 * 15, '20000101')
         )
   ) G
   CROSS APPLY (
      SELECT IntervalStart = Max(T.BeginTime)
      FROM (
         SELECT L.StatusSetDateTime
         UNION ALL SELECT G.Edge
      ) T (BeginTime)
   ) S
   CROSS APPLY (
      SELECT IntervalEnd = Min(T.EndTime)
      FROM (
         SELECT L.StatusEndDateTime
         UNION ALL SELECT G.Edge + '00:15:00'
      ) T (EndTime)
   ) E
WHERE
   N.Offset <= L.StatusDuration
ORDER BY
   L.StatusSetDateTime,
   S.IntervalStart;

Here is setup script if you want to try it:

CREATE TABLE #AgentActivityLog (
    StatusSetDateTime datetime,
    StatusEndDateTime datetime,
    StatusDuration AS (DateDiff(second, 0, StatusEndDateTime - StatusSetDateTime))
);

INSERT #AgentActivityLog -- weird end times
SELECT '20120101 12:00:00', '20120101 13:00:00'
UNION ALL SELECT '20120101 13:00:00', '20120101 13:27:56'
UNION ALL SELECT '20120101 13:27:56', '20120101 13:28:52'
UNION ALL SELECT '20120101 13:28:52', '20120120 11:00:00'

INSERT #AgentActivityLog -- 15-minute quantized end times
SELECT '20120101 12:00:00', '20120101 13:00:00'
UNION ALL SELECT '20120101 13:00:00', '20120101 13:30:00'
UNION ALL SELECT '20120101 13:30:00', '20120101 14:00:00'
UNION ALL SELECT '20120101 14:00:00', '20120120 11:00:00'

Also, here's a version that expects ONLY times that have whole minutes ending in :00, :15, :30, or :45.

DECLARE @MaxDuration int;
SET @MaxDuration = (SELECT Max(StatusDuration) / 60 FROM #AgentActivityLog);

WITH
L0 AS(SELECT 1 c UNION ALL SELECT 1),
L1 AS(SELECT 1 c FROM L0, L0 B),
L2 AS(SELECT 1 c FROM L1, L1 B),
L3 AS(SELECT 1 c FROM L2, L2 B),
L4 AS(SELECT 1 c FROM L3, L3 B),
L5 AS(SELECT 1 c FROM L4, L4 B),
Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) n FROM L5)
SELECT
   S.IntervalStart,
   Duration = CASE WHEN Offset = StatusDuration THEN 0 ELSE 900 END
FROM
   #AgentActivityLog L
   CROSS APPLY (
      SELECT N, Offset = (N.N - 1) * 900
      FROM Nums N
      WHERE N <= @MaxDuration
   ) N
   CROSS APPLY (
      SELECT IntervalStart = DateAdd(second, N.Offset, L.StatusSetDateTime)
   ) S
WHERE
   N.Offset <= L.StatusDuration   
ORDER BY
   L.StatusSetDateTime,
   S.IntervalStart;

It really seems like having the final 0 Duration row is not correct, because then you can't just order by IntervalStart as there are duplicate IntervalStart values. What is the benefit of having rows that add 0 to the total?

ErikE
  • 48,881
  • 23
  • 151
  • 196