7

I have a table

T (variable_name, start_no, end_no) 

that holds values like:

(x, 10, 20)
(x, 30, 50)
(x, 60, 70)
(y, 1, 3)
(y, 7, 8)

All intervals are guaranteed to be disjoint.

I want to write a query in T-SQL that computes the intervals where a variable is not searched:

(x, 21, 29)
(x, 51, 59)
(y, 4, 6)

Can I do this without a cursor?

I was thinking of partitioning by variable_name and then ordering by start_no. But how to proceed next? Given the current row in the rowset, how to access the "next" one?

cloudy
  • 73
  • 3

5 Answers5

8

Since you didn't specify which version of SQL Server, I have multiple solutions. If you have are still rocking SQL Server 2005, then Giorgi's uses CROSS APPLY quite nicely.

Note: For both solutions, I use the where clause to filter out improper values so even if the the data is bad and the rows overlap, it will ignore those values.

My Version of Your Table

DECLARE @T TABLE (variable_name CHAR, start_no INT, end_no INT) 
INSERT INTO @T
VALUES  ('x', 10, 20),
        ('x', 30, 50),
        ('x', 60, 70),
        ('y', 1, 3),
        ('y', 7, 8);

Solution for SQL Server 2012 and Above

SELECT *
FROM
(
    SELECT  variable_name,
            LAG(end_no,1) OVER (PARTITION BY variable_name ORDER BY start_no) + 1 AS start_range,
            start_no - 1 AS end_range
    FROM @T
) A
WHERE end_range > start_range

Solution for SQL 2008 and Above

WITH CTE
AS
(
    SELECT  ROW_NUMBER() OVER (PARTITION BY variable_name ORDER BY start_no) row_num,
            *
    FROM @T
)

SELECT  A.variable_name,
        B.end_no + 1 AS start_range,
        A.start_no - 1 AS end_range
FROM CTE AS A
INNER JOIN CTE AS B
ON      A.variable_name = B.variable_name
    AND A.row_num = B.row_num + 1
WHERE A.start_no - 1 /*end_range*/ > B.end_no + 1 /*start_range*/
Stephan
  • 5,891
  • 1
  • 16
  • 24
3

Here is another version with cross apply:

DECLARE @t TABLE ( v CHAR(1), sn INT, en INT )
INSERT  INTO @t
VALUES  ( 'x', 10, 20 ),
        ( 'x', 30, 50 ),
        ( 'x', 60, 70 ),
        ( 'y', 1, 3 ),
        ( 'y', 7, 8 );

SELECT t.v, t.en + 1, c.sn - 1 FROM @t t
CROSS APPLY(SELECT TOP 1 * FROM @t WHERE v = t.v AND sn > t.sn ORDER BY sn)c
WHERE t.en + 1 < c.sn

Fiddle http://sqlfiddle.com/#!3/d6458/3

Giorgi Nakeuri
  • 35,155
  • 8
  • 47
  • 75
  • 1
    One issue with this. What if there is no gap? Lets say the second row is ('x',21,50). Then you will return a row with ('x',21,20) which is improper. You should add a where clause **WHERE t.en + 1 < c.sn - 1** – Stephan Jun 16 '15 at 15:38
  • Excellent, thank you! I didn't mark yours as t-h-e answer since indeed I didn't specify the SQL Server version in initial post. – cloudy Jun 17 '15 at 06:16
0

For each end_no you should find the nearest start_no > end_no then exclude rows without nearest start_no (last rows for the variable_name)

WITH A AS
(
SELECT variable_name, end_no+1 as x1, 
              (SELECT MIN(start_no)-1 FROM t 
                     WHERE t.variable_name = t1.variable_name
                     AND t.start_no>t1.end_no) as x2
FROM t as t1 )
SELECT * FROM A WHERE x2 IS NOT NULL 
ORDER BY variable_name,x1

SQLFiddle demo

Also here is my old answer to the similar question:

Allen's Interval Algebra operations in SQL

Community
  • 1
  • 1
valex
  • 23,966
  • 7
  • 43
  • 60
0

This is very portable as it doesn't require CTEs or analytic functions. I could also easily be rewritten without the derived table if that were ever necessary.

select * from (
    select
        variable_name,
        end_no + 1 as start_no,
        (
            select min(start_no) - 1
            from T as t2
            where t2.variable_name = t1.variable_name and t2.start_no > t1.end_no
        ) as end_no
    from T as t1
) as intervals
where start_no <= end_no

The number of complemented intervals will be at maximum one fewer than the what you start with. (Some will be eliminated if two ranges were actually consecutive.) So it's easy to take each separate intervals and calculate the one just to its right (or left if you wanted to reverse some of the logic.)

shawnt00
  • 16,443
  • 3
  • 17
  • 22
0

Here's a non-CTE version that seems to work: http://sqlfiddle.com/#!9/4fdb4/1

Given the guaranteed disjoint ranges, I just joined T to itself, computed the next range as the increment/decrement of the adjoining range, then ensuring the new range didn't overlap any existing ranges.

select t1.variable_name, t1.end_no+1, t2.start_no-1
  from t t1
  join t t2
     on t1.variable_name=t2.variable_name
where t1.start_no < t2.start_no
  and t1.end_no < t2.end_no
  and not exists (select *
                    from t 
                   where ((t2.start_no-1< t.end_no
                     and t1.end_no+1 > t.start_no) or
                         (t1.end_no + 1 < t.end_no and
                          t2.start_no-1 > t.end_no))
                          and t.variable_name=t1.variable_name)
David W
  • 10,062
  • 34
  • 60
  • The join will be one to many, won't it? Does the `not exists` query handle the extra rows? – shawnt00 Jun 16 '15 at 16:00
  • The intent is for the not exists to handle edge cases where one end of a new possible unsearched range overlaps an existing one; as for generating too many rows, I did not see that in my testing, but perhaps I overlooked a condition. – David W Jun 16 '15 at 16:01
  • I just looked at the fiddle. It does generate extra rows that do end up overlapping so ultimately they are culled. – shawnt00 Jun 16 '15 at 16:05