13

I have a table with the following definition

CREATE TABLE mytable
  (
     id     INT IDENTITY(1, 1) PRIMARY KEY,
     number BIGINT,
     status INT
  )

and example data

INSERT INTO mytable
VALUES (100,0),
       (101,0),
       (102,0),
       (103,0),
       (104,1),
       (105,1),
       (106,0),
       (107,0),
       (1014,0),
       (1015,0),
       (1016,1),
       (1017,0)

Looking only at the rows where status = 0 how can I collapse the Number values into ranges of contiguous sequential numbers and find the start and end of each range?

i.e. For the example data the results would be

         FROM      to 
Number    100      103
Number    106      107
Number    1014     1015
Number    1017     1017
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
Mike Station
  • 193
  • 1
  • 8
  • I've done this before, months ago. I don't think I saved the query, but it might still be in my local logs. Hold tight. – 000 Jun 11 '13 at 14:10
  • 3
    google for "sql server gaps and islands" – OzrenTkalcecKrznaric Jun 11 '13 at 14:13
  • sorry i can't find it. – 000 Jun 11 '13 at 14:13
  • Oh wow there's an entire tag on SO for this question: http://stackoverflow.com/questions/tagged/gaps-and-islands – 000 Jun 11 '13 at 14:14
  • SQL is very inefficient at looking at data across rows without a cursor (and cursors are bad!). I strongly recommend you coming up with a way to either store the ranges in another collection, or pull down all the data and crunch the numbers (i.e. calculate the ranges) yourself. With that said, you can likely get away with using a recursive CTE to get the job done. How exactly, I'm not sure. As suggested, check out gaps and islands. – Eli Gassert Jun 11 '13 at 14:16
  • come on genius, I know u can find the solution ;) – Mike Station Jun 11 '13 at 14:25
  • 3
    @EliGassert This is entirely untrue. "Gaps and Islands" problems can be solved very efficiently in SQL with resorting to either Cursors or the even more iniefficient Recursive CTEs. Itzak Ben-Gan has written extensively on how this can be done. Here: http://www.sqlmag.com/article/tsql3/calculating-concurrent-sessions-part-3 – RBarryYoung Jun 11 '13 at 14:37

1 Answers1

33

As mentioned in the comments this is a classic gaps and islands problem.

A solution popularized by Itzik Ben Gan is to use the fact that DENSE_RANK() OVER (ORDER BY number) - number remains constant within an "island" and cannot appear in multiple islands.

WITH T
     AS (SELECT DENSE_RANK() OVER (ORDER BY number) - number AS Grp,
                number
         FROM   mytable
         WHERE  status = 0)
SELECT MIN(number) AS [From],
       MAX(number) AS [To]
FROM   T
GROUP  BY Grp
ORDER  BY MIN(number);

On later versions my preference is to use LAG/LEAD for this to reduce the amount of sorting going on. The below should work from 2012+

WITH T AS
(
SELECT *,
       MinNumber  = MIN(number) OVER (ORDER BY number ROWS UNBOUNDED PRECEDING),
       NextNumber = LEAD(number) OVER (ORDER BY number)
FROM mytable
WHERE status = 0
)
SELECT LAG(NextNumber, 1, MinNumber) OVER (ORDER BY number) AS [From], 
       number AS [To]
FROM T
WHERE NextNumber IS NULL OR NextNumber <> number+ 1
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • +1: Yes, this is the best approach. No cursed Cursors or inefficent recursion necessary. – RBarryYoung Jun 11 '13 at 14:39
  • 3
    Here's one of the places that Itzak talks about it: http://www.sqlmag.com/article/tsql3/calculating-concurrent-sessions-part-3. I won't mention who he credits with solving it (besides Ben Flanaghan and Arnold Fribble). ;-) – RBarryYoung Jun 11 '13 at 14:46
  • @RBarryYoung - Very modest! Apologies for crediting the wrong person in my answer! – Martin Smith Jun 11 '13 at 14:51
  • @MartinSmith Yeah, my one claim to fame. :-) Besides, Itzak Ben-Gan really is a genius IMHO. – RBarryYoung Jun 11 '13 at 14:53
  • 1
    why did you user the - in your query ? – Mike Station Jun 12 '13 at 10:06
  • 2
    @MikeStation - As opposed to what? The `GRP` value needs to be something constant within an island and unique to an island that achieves it. e.g. See the results here and experiment with changing the example data and you should see how it works. http://sqlfiddle.com/#!6/f1040/2 – Martin Smith Jun 12 '13 at 10:12