5

I'm having difficulty with a consecutive rows query, and need some assistance.

I have a table that is part of a telephony switch. It tracks every employee who logs into a phone (and the phones extension). I have another table that tells me where that phone is (across multiple sites).

The logic we're going for is if someone logs into a particular phone/site for 2 consecutive days, we'll trigger an event to update their employee record to make this site their primary location.

The issue I'm having with my partitioning is that even if the data isn't truly consecutive, my count is increasing.

If you consider the following table:

row_date    logid   extn    duration    sitecode    daysconsecutive
----------------------------------------------------------------------------    
2014-05-22  500001  414128  9.82    W   1
2014-05-27  500001  414120  10.74   W   2
2014-05-28  500001  414149  5.47    W   3
2014-05-28  500001  414126  6.18    W   4
2014-05-29  500001  414128  11.80   W   5
2014-05-30  500001  414128  2.21    W   6
2014-05-30  500001  414150  5.47    N   1
2014-05-31  500001  414128  4.57    W   7
2014-06-02  500001  414150  4.94    N   2

As you can see on 5/30 this person moved from site W to N. Then on 5/31 they went back to site W -- but this row should be a 1, not a 7, as it's not contiguous with the 6 consecutive days he was at site W from 5/22 - 5/30.

Once I have this ironed out, I'll select the max entry where daysconsecutive>=2 and use that to fire the update.

Any help you can offer would be appreciated.

This is for SQL Server 2008 R2.

Cheers.

Fiddle here.

create table haglog (row_date datetime, logid int, extn int, duration decimal(10,2), sitecode varchar(10))

insert into haglog 
select '2014-05-22',500001,414128,9.82,'W' union all
select '2014-05-27',500001,414120,10.74,'W' union all
select '2014-05-28',500001,414149,5.47,'W' union all
select '2014-05-28',500001,414126,6.18,'W' union all
select '2014-05-29',500001,414128,11.80,'W' union all
select '2014-05-30',500001,414128,2.21,'W' union all
select '2014-05-30',500001,414150,5.47,'N' union all
select '2014-05-31',500001,414128,4.57,'W' union all
select '2014-06-02',500001,414150,4.94,'N'

;with consecutivecte
AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY logid, sitecode ORDER BY row_date) AS daysconsecutive
FROM haglog
)

select * 
from consecutivecte
where logid=500001
order by row_date
Anon
  • 10,660
  • 1
  • 29
  • 31
Dave C
  • 7,272
  • 1
  • 19
  • 30

2 Answers2

1

You can use a gaps-and-islands technique to isolate runs of consecutive days. If you have two rowcounts, one partitioned by sitecode and one unpartitioned, then the difference between the two is invariant within a consecutive run of the same sitecode. This can be used to distinguish between different runs and generate a running count for each.

WITH
  t1 AS (
    SELECT *,
      ROW_NUMBER() OVER(PARTITION BY logid, sitecode ORDER BY row_date)
      - ROW_NUMBER() OVER(PARTITION BY logid ORDER BY row_date) consecutive_group_id
    FROM @haglog
  ),
  t2 AS (
    SELECT *,
      ROW_NUMBER() OVER(PARTITION BY logid, sitecode, consecutive_group_id ORDER BY row_date) daysconsecutive
    FROM t1
  )
SELECT row_date, logid, extn, duration, sitecode, daysconsecutive
FROM t2
ORDER BY row_date
Anon
  • 10,660
  • 1
  • 29
  • 31
  • This works great. I added a rowID into t2 (`ROW_NUMBER() OVER(ORDER BY row_date,logid) as rowID`) and joined back to the max(rowID) group by logid so I only use the most recent event, and it works perfectly. Fantastic, thanks again! – Dave C Jun 03 '14 at 17:51
  • If need to isolate the most recent event, you can avoid the self-join with `ORDER BY row_date DESC` in the ROW_NUMBER() and select where it = 1 – Anon Jun 03 '14 at 18:02
  • True, but then I can't do where daysconsecutive>=2 since the numbering is inverted. I guess I could add another column with the inverted ordering.. :) – Dave C Jun 03 '14 at 18:07
0

I would use cross apply (select top(1) ...) instead of CTE to query previous record for each record in dataset. And limit it by one last day.

So you will get say yesterday records and for each of them previous records by the same user.

select h.* , t.* 
from haglog h
    cross apply ( select top(1) * from haglog t where ...) as t
where logid=500001 and date = @yesterday
vittore
  • 17,449
  • 6
  • 44
  • 82
  • Thank you for replying, but I don't think this approach will give the flexibility required should the business rules change to 3 consecutive days. I'll also not be limiting to a specific employee id or date. Also, the dates don't necessarily have to be back to back, it could be a Friday login and Monday login and that would count as 2 days in a row. – Dave C Jun 03 '14 at 15:46
  • @JiggsJedi last two conditions you mentioned can be implemented with such approach just fine. – vittore Jun 03 '14 at 15:48