This is a fairly difficult problem to solve with SQL due to the need to find transitions and ranges in the data, which is not trivial. I've broken the problem down into a series of steps made of successive cte's that build on one another and lead up to a final working solution:
First, I add a row index to the data to provide a simple PK for identifying a unique row:
with NumberedAtt as (
select
row_number() over (partition by [user] order by date, time, reader) as ix,
att.[user],
att.[department],
att.[date] + att.[time] as dt,
att.[reader]
from att
)
Then I grab the first and last index value per user which will be used for the outermost boundaries of each entry/exit range:
, MinMax as (
select [user], min(ix) ixMin, max(ix) ixMax
from NumberedAtt N group by [user]
)
Next I put these together to generate a list of all exit/entry ranges, which are the points where the value of Reader
changes from 2
to 1
. These are the specific points that exactly identify when a previous time range ends, and the next time range begins (and cleanly handles successive duplicate entry/exit reads). By combining this with the first entry and last exit time for each user, a list of all entry/exit transitions is generated:
, Transitions as (
select N.[User], 0 as exitIx, M.ixMin as entryIx
from NumberedAtt N
join MinMax M on N.[User] = M.[User]
where N.ix = M.ixMin
union
select N.[User], M.ixMax as exitIx, 0 as entryIx
from NumberedAtt N
join MinMax M on N.[User] = M.[User]
where N.ix = M.ixMax
union
select A1.[User], A1.ix as exitIx, A2.ix as entryIx
from NumberedAtt A1
join NumberedAtt A2 on A1.ix + 1 = A2.ix and A1.[user] = A2.[user]
where A1.[reader] = 2 and A2.[reader] = 1
)
Here is the output at this point:
| USER | EXITIX | ENTRYIX |
|------|--------|---------|
| A1 | 0 | 1 |
| A1 | 2 | 3 |
| A1 | 4 | 0 |
| B1 | 0 | 1 |
| B1 | 3 | 0 |
Notice that we've neatly captured all of the row indexes where a range of time begins and ends. However, they are offset by one - that is the entry time in one row corresponds to the exit time in the next row. So we need one more transformation to bring the ranges back together by adding a row index to this table and joining each row with the following row:
, NumberedTransitions as (
select
row_number() over (partition by [User] order by exitIx) tix,
T.*
from Transitions T
), EntryExit as (
select
aEntry.ix as ixEntry,
aExit.ix as ixExit,
aEntry.[user],
aEntry.[department],
aEntry.[dt] as entryDT,
aExit.[dt] as exitDT
from NumberedTransitions tEntry
join NumberedAtt aEntry on tEntry.entryIx = aEntry.ix and tEntry.[user] = aEntry.[user]
join NumberedTransitions tExit on tEntry.tix + 1 = tExit.tix and tEntry.[user] = tExit.[user]
join NumberedAtt aExit on tExit.exitIx = aExit.ix and tExit.[user] = aExit.[user]
)
After joining the successive ranges together, I also pull in the original detail data back in, since I've been working only with the row index values so far. At the end of this subquery, we have identified all the entry/exit ranges per user and "swallowed up" any multiple reads:
| IXENTRY | IXEXIT | USER | DEPARTMENT | ENTRYDT | EXITDT |
|---------|--------|------|------------|------------------------------|------------------------------|
| 1 | 2 | A1 | IT | March, 01 2014 11:12:00+0000 | March, 01 2014 13:12:00+0000 |
| 3 | 4 | A1 | IT | March, 02 2014 07:42:15+0000 | March, 02 2014 16:16:15+0000 |
| 1 | 3 | B1 | IT | March, 01 2014 12:28:06+0000 | March, 01 2014 13:28:06+0000 |
Now the only thing left to do is group the data together to report on total hours per user, per month. It is a little tricky to calculate the total hours, but it can be done by taking the sum of minutes between the ranges and then converting the result back into a time value:
, Hours as (
select
[User],
[Department],
Year(EntryDT) Year,
Month(EntryDT) Month,
RIGHT('0' + CAST(SUM(DATEDIFF(Minute, EntryDT, ExitDT)) / 60 as varchar(10)), 2) + ':' +
RIGHT('0' + CAST(SUM(DATEDIFF(Minute, EntryDT, ExitDT)) % 60 as varchar(2)), 2) as TotalHours
from EntryExit EE
group by [User], [Department], Year(EntryDT), Month(EntryDT)
)
This gives a final result which is very close to the desired result:
| USER | DEPARTMENT | YEAR | MONTH | TOTALHOURS |
|------|------------|------|-------|------------|
| A1 | IT | 2014 | 3 | 10:34:00 |
| B1 | IT | 2014 | 3 | 01:00:00 |
A few tweaks could be made to the formatting as desired, but that should be easy to build on top of this framework.
Here is a working demo: http://www.sqlfiddle.com/#!3/f3f37/7