Fast way to find the closest polygon to a point

Question

I have a list of Shapely polygons and a point like so:

from shapely.geometry import Point, Polygon

polygons = [Polygon(...), Polygon(...), ...]
point = Point(2.5, 5.7)

and I want to find the closest polygon in the list to that point. I'm already aware of the object.distance(other) function which returns the minimum distance between two geometric shapes, and I thought about computing all the distances in a loop to find the closest polygon:

polygons = [Polygon(...), Polygon(...), ...]
point = Point(2.5, 5.7)
min_dist = 10000
closest_polygon = None

for polygon in polygons:
    dist = polygon.distance(point)
    if dist < min_dist:
        min_dist = dist
        closest_polygon = polygon

My question is: Is there a more efficient way to do it?

Could this be of any help: [Looking for a fast way to find the polygon a point belongs to using Shapely](https://stackoverflow.com/questions/20297977/looking-for-a-fast-way-to-find-the-polygon-a-point-belongs-to-using-shapely)? I've never worked with the suggested R-trees, so I don't know if it can help in your case. — Georgy, Jul 30 '19 at 15:09
Btw, you can write your current code more concise as `closest_polygon = min(polygons, key=point.distance)`. — Georgy, Jul 30 '19 at 15:15
I think it's a good idea. I am also working on a solution using kd-trees... — ma3oun, Jul 30 '19 at 15:21

Drey · Answer 1 · 2019-07-30T17:45:35.717

2

There is a shorter way, e.g.

from shapely.geometry import Point, Polygon
import random
from operator import itemgetter


def random_coords(n):
    return [(random.randint(0, 100), random.randint(0, 100)) for _ in range(n)]


polys = [Polygon(random_coords(3)) for _ in range(4)]
point = Point(random_coords(1))

min_distance, min_poly = min(((poly.distance(point), poly) for poly in polys), key=itemgetter(0))

as Georgy mentioned (++awesome!) even more concise:

min_poly = min(polys, key=point.distance)

but distance computation is in general computationally intensive

edited Jul 30 '19 at 17:45

answered Jul 30 '19 at 14:40

Drey

3,314
2
21
26

It is indeed a shorter solution, but I just tested it and it takes only slightly more time than the loop. Thanks for sharing anyway! – SalmaFG Jul 30 '19 at 15:02

score 1 · Answer 2 · answered Jul 30 '19 at 16:25

I have a solution that works if you have at least 2 polygons with a distance different from 0. Let's call these 2 polygons "basePolygon0" and "basePolygon1". The idea is to build a KD tree with the distance of each polygon to each of the "basis" polygons. Once the KD tree has been built, we can query it by computing the distance to each of the basis polygons.

Here's a working example:

from shapely.geometry import Point, Polygon
import numpy as np
from scipy.spatial import KDTree 

# prepare a test with triangles
poly0 = Polygon([(3,-1),(5,-1),(4,2)])
poly1 = Polygon([(-2,1),(-4,2),(-3,4)])
poly2 = Polygon([(-3,-3),(-4,-6),(-2,-6)])
poly3 = Polygon([(-1,-4),(1,-4),(0,-1)])

polys = [poly0,poly1,poly2,poly3]

p0 = Point(4,-3)
p1 = Point(-4,1)
p2 = Point(-4,-2)
p3 = Point(0,-2.5)

testPoints = [p0,p1,p2,p3]

# select basis polygons
# it works with any pair of polygons that have non zero distance
basePolygon0 = polys[0]
basePolygon1 = polys[1]

# compute tree query
def buildQuery(point):
    distToBasePolygon0 = basePolygon0.distance(point)
    distToBasePolygon1 = basePolygon1.distance(point)
    return np.array([distToBasePolygon0,distToBasePolygon1])


distances = np.array([buildQuery(poly) for poly in polys])

# build the KD tree
tree = KDTree(distances)

# test it
for p in testPoints:
    q = buildQuery(p)
    output = tree.query(q)
    print(output)

This yields as expected:

# (distance, polygon_index_in_KD_tree)
(2.0248456731316584, 0)
(1.904237866994273, 1)
(1.5991500555008626, 2)
(1.5109986459170694, 3)

score 0 · Answer 3 · answered Jul 30 '19 at 14:41

There is one way that might be faster, but without doing any actual tests, it's hard for me to say for sure.

This might not work for your situation, but the basic idea is each time a Shapely object is added to the array, you adjust the position of different array elements so that it is always "sorted" in this manner. In Python, this can be done with the heapq module. The only issue with that module is that it's hard to choose a function to compare to different objects, so you would have to do something like this answer, where you make a custom Class for objects to put in the heapq that is a tuple.

import heapq

class MyHeap(object):
   def __init__(self, initial=None, key=lambda x:x):
       self.key = key
       if initial:
           self._data = [(key(item), item) for item in initial]
           heapq.heapify(self._data)
       else:
           self._data = []

   def push(self, item):
       heapq.heappush(self._data, (self.key(item), item))

   def pop(self):
       return heapq.heappop(self._data)[1]

The first element in the tuple is a "key", which in this case would be the distance to the point, and then the second element would be the actual Shapely object, and you could use it like so:

point = Point(2.5, 5.7)
heap = MyHeap(initial=None, key=lambda x:x.distance(point))

heap.push(Polygon(...))
heap.push(Polygon(...))
# etc...

And at the end, the object you're looking for will be at heap.pop().

Ultimately, though, both algorithms seem to be (roughly) O(n), so any speed up would not be a significant one.

Fast way to find the closest polygon to a point

3 Answers3