11

I have this data:

IF OBJECT_ID('tempdb..#temp') IS NOT NULL
    DROP TABLE #temp
CREATE TABLE #temp
    (
      Id INT IDENTITY(1, 1) ,
      X FLOAT NOT NULL ,
      Y FLOAT NOT NULL
    )

INSERT INTO #temp (X, Y) VALUES (0, 0)
INSERT INTO #temp (X, Y) VALUES (0, 1)
INSERT INTO #temp (X, Y) VALUES (0, 2)
INSERT INTO #temp (X, Y) VALUES (0.5, 1)
INSERT INTO #temp (X, Y) VALUES (1, 1)
INSERT INTO #temp (X, Y) VALUES (1, 2)
INSERT INTO #temp (X, Y) VALUES (1.5, 0.5)
INSERT INTO #temp (X, Y) VALUES (2, 0)
INSERT INTO #temp (X, Y) VALUES (2, 1)

I would like to remove points that are contained within other points, such as:

(0, 1)
(1, 1)
(1.5, 0.5)

to obtain the outer most points that define the outer polygon consisting of only vertical and horizontal lines without redundancies (e.g. (0, 1) is a redundant point). Can this be achieved with a set based TSQL approach in SQL Server 2014?

PS:

A scatter plot of the data is as follows:

enter image description here

I would like to remove the encircled points. Ultimately, I am after the outer border (drawn as red lines). Hope this makes it clearer.

cs0815
  • 16,751
  • 45
  • 136
  • 299
  • 1
    So what is the desired result? – Tim Schmelter Jul 02 '15 at 09:29
  • Sorry if I was not clear. Remove the points: (0, 1), (1, 1), (1.5, 0.5) – cs0815 Jul 02 '15 at 09:31
  • I have added an additional picture. Hope that makes it clearer. – cs0815 Jul 02 '15 at 09:37
  • 3
    Question is ambiguous as the same points can create another polygon. For example you could connect (1,2) with (1.5,.5) and then with (2,1) which would then contain the point (1,1) instead of (1.5,.5). – ughai Jul 02 '15 at 10:00
  • 2
    @ughai: he wants only the vertices of the outermost polygon. – Tim Schmelter Jul 02 '15 at 10:43
  • @ughai - yes Tim is right - sorry, if I was not clear. – cs0815 Jul 02 '15 at 10:57
  • 2
    ughai is correct and the issue has not been answered as far as I can see. What about the polygon where (1,2) is joined to (2,1)? That could be considered "outermost". Are we saying a convex polygon composed of horizontal and vertical lines only? – Fruitbat Jul 02 '15 at 11:17
  • Ok you are right. I think the technical term for this would convex hull. Yes, let us restrict this to outer most horizontal and vertical lines only. I will update the original question. – cs0815 Jul 02 '15 at 11:26
  • Actually, now I think about it, your red outline isn't convex. You would need (1,2) to join directly to (2,1). Also, for an arbitrary set of points, the condition "vertical and horizontal only" may not be achievable (for example if you had (0,2.5) instead of (0,2). Needs a rethink. – Fruitbat Jul 02 '15 at 11:35
  • @Fruitbat - to be honest all this is a simplification of my actual problem, which is related to this: http://stackoverflow.com/questions/31178902/combining-geography-data where I try to use STUnion to find the combining polygon of in fact a lot of 4 corner polygons ("squares" with top left, lower left, lower right and top right points). – cs0815 Jul 02 '15 at 11:39
  • you don't want to remove (1,1), right? you want to remove (0.5,1)? – Beth Jul 02 '15 at 15:20
  • @Beth that's correct. – cs0815 Jul 02 '15 at 19:29

1 Answers1

1

I believe this might work. It seems to deliver on your test data. A bit rough. Some of the SELECT MIN and SELECT MAX could perhaps be calculated in advance if your real data is large.

SELECT * 
-- uncomment this to delete the desired points
-- DELETE #temp
FROM #temp t
WHERE 
(
    -- Internal points
    (
            ( X > (SELECT MIN(X) FROM #temp) AND X < (SELECT MAX(X) FROM #temp) )
        AND ( Y > (SELECT MIN(Y) FROM #temp) AND Y < (SELECT MAX(Y) FROM #temp) )
    )
    -- Exceptions (points with nothing strictly outside them) [Don't want to lose (1,1)]
    AND EXISTS (SELECT * FROM #temp WHERE X > t.X AND Y > t.Y)
)
OR
    -- redundant edge points [(0,1) would be included as an "exception"]
(
    ( (t.X = (SELECT MIN(X) FROM #temp) OR t.X = (SELECT MAX(X) FROM #temp)) 
        AND EXISTS (SELECT * FROM #temp WHERE X = t.X AND Y > t.Y) 
        AND EXISTS (SELECT * FROM #temp WHERE X = t.X AND Y < t.Y)  )
    OR
    ( (t.Y = (SELECT MIN(Y) FROM #temp) OR t.Y = (SELECT MAX(Y) FROM #temp)) 
        AND EXISTS (SELECT * FROM #temp WHERE Y = t.Y AND X > t.X) 
        AND EXISTS (SELECT * FROM #temp WHERE Y = t.Y AND X < t.X)  )
)
Fruitbat
  • 764
  • 2
  • 5
  • 19