1

I have a table with columns: name, start date (a date value) and finish date(a date value). I want to group by name adding up the dates so I get the total time with no collisions. So, if I have a table

name | start date | finish date
===============================
a    | 20/10/2015 | 22/10/2015
a    | 21/10/2015 | 22/10/2015
a    | 26/10/2015 | 27/10/2015

So, if I group by name, the 3 rows will aggregate, if I simply add the DATEDIFF day per row I'll get 4, if I calculate the DATEDIFF between the MIN start date and the MAX finish date it will be 7, when in reality the right answer would be 3, since the second row collides with the first one and I only need to count that time once.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Oscar Vasquez
  • 465
  • 2
  • 6
  • 14
  • Is this even possible in SQL? I don't know much about SQL but I suspect it might not be possible – Oscar Vasquez Dec 11 '15 at 00:00
  • This is actually more complicated that it might appear at first glance. Check out these possibly duplicate questions: http://stackoverflow.com/questions/34072016/contiguous-dates and http://stackoverflow.com/questions/29549117/how-to-make-calculation-on-time-intervals – Brian Pressler Dec 11 '15 at 00:11
  • what database are you using? – cha Dec 11 '15 at 00:14

1 Answers1

0

Thanks for your comments below. I have used a completely different approach. First L build a calendar CTE a with all the dates that exist in your table. You may use an existing calendar table from your database if you have one. Then in the CTE b I CROSS JOIN the calendar CTE to get the dates that exist for the date ranges. In this CTE it does not matter how many overlapping ranges you have as The date will be included once only using the GROUP BY [name] clause. And now all you need to do is to count the number of the individual dates in the CTE c:

SQL Fiddle

MS SQL Server 2008 Schema Setup:

CREATE TABLE Table1
    ([name] varchar(1), [start date] datetime, [finish date] datetime)
;

INSERT INTO Table1
    ([name], [start date], [finish date])
VALUES
    ('a', '2015-10-20 00:00:00', '2015-10-22 00:00:00'),
    ('a', '2015-10-21 00:00:00', '2015-10-22 00:00:00'),
    ('a', '2015-10-21 00:00:00', '2015-10-23 00:00:00'),
    ('a', '2015-10-26 00:00:00', '2015-10-27 00:00:00')
;

Query 1:

with dt as(
  select min([start date]) as sd, max([finish date]) as fd from Table1
),
a as (
  select sd from dt
  union all 
  select dateadd(day, 1, a.sd)
  FROM a cross join dt
  where a.sd < fd
),
b as(
  select [name], sd
  from table1 cross join a where a.sd between [start date] and [finish date]
  group by [name], sd
),
c as (
  select [name], count(*) days from b group by [name]
)
select * from c
option (maxrecursion 0)

Results:

| name | days |
|------|------|
|    a |    6 |
cha
  • 10,301
  • 1
  • 18
  • 26