2

I have several polygons which are actually a union of points. I want a reasonably quick way of removing the points which are not land (but on river, lake, ocean, etc).

So I far I have come up with the below method which takes me from the left picture to the right one:

enter image description here

import fiona
from shapely.geometry import Point, shape

# Shape files (Small Lakes Europe, Lakes, Coast)
le_loc = ".../water-polygons-split-4326/ne_10m_lakes_europe.shp"
lw_loc = ".../water-polygons-split-4326/ne_10m_lakes.shp"
c_loc = ".../water-polygons-split-4326/water_polygons.shp"

print("Opening Shape-Files")
le_poly = [shape(pol['geometry']) for pol in fiona.open(le_loc)]
lw_poly = [shape(pol['geometry']) for pol in fiona.open(lw_loc)]
c_poly = [shape(pol['geometry']) for pol in fiona.open(c_loc)]

def point_in_water(lat, lon):
    xy_point = Point(lon, lat) 
    for one in [c_poly, lw_poly, le_poly]:
        for shape in one:
            if shape.contains(xy_point):
                return True
    return False

# Test (true)
print(point_in_water(46.268408, 6.180437))

Then in a loop I test my points:

with open(in_csv) as f:
    for x in csv.reader(f):
        # Lat, Lng, Mins, Mode
        if not point_in_water(float(x[0]), float(x[1])):
           coords.append([x[0], x[1])

I use three shape-files which are OK for my purpose (the lakes one is a bit rough): coast, lakes, small lakes.

However, for 10,000 points the code is a bit slow (and I have around 30 files so 300,000 points to examine).

I was wondering if any of the following would be possible:

1) I am looping through shapes and checking shape.contains(point) instead of looping through points and checking point.within(shape) -> I wasn't sure if there would be an improvement?

2) Using a spatial-index would probably speed this up, however I don't think RTree works anymore with Python 3.4

3) Perhaps there is a faster function (a rough contains) which just checks bounds and I can use that as step one and then contains as step 2.

4) Instead of looping through points, is there a way to vectorise and pass all at once?

5) Perhaps it would be faster to convert the shapely Polygons to a path so that I can use matpotlib's path.contains_point?

6) Finally, I realise that I should probably use a Mercator projection for the point in poly test, but a rough cut is fine for me (it's not like the water shapefile is super accurate anyway).

Thank you

mptevsion
  • 937
  • 8
  • 28
  • disclaimer, I havent used shapely or fiona that said, do you need the points within the boundaries or just the boundaries? If just the boundaries then maybe a way of simplifying things would be to use [Convex Hull](http://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.ConvexHull.html#scipy.spatial.ConvexHull) – DrBwts Jan 04 '16 at 11:53
  • @DrBwts Unfortunately I do need to keep the points since I am just dropping points in the water – mptevsion Jan 04 '16 at 20:11
  • `rtree` version 0.8 introduced support for Python 3.x, so try and get that. A spatial index will vastly improve your performance. – Mike T Jan 10 '16 at 23:17
  • To speed up the process I would try to get the union of all "water" polygons first, and then check each point against this multipolygon. I mean, it will be a huge multipolygon, and the union operation can take a while, but I think that if you have a lot of points, this could really speed up the process. – eguaio Feb 29 '16 at 17:41

0 Answers0