
I'm working in Python. My problem is to efficiently find, for a given 2D point (the car's location), the closest point in an array of thousands of 2D points representing road network coordinates. The calculation has to run online, at every time stamp (every 0.2 seconds). I looked at the closest-pair-of-points algorithm, but it finds the closest pair within an array rather than the closest point to a given query point, which is what I need. Is anyone familiar with a Python implementation or a suitable algorithm? Any help is welcome.

Ran
  • What does the array look like? Brute forcing is just O(n) at worst anyway. – fractals Aug 21 '18 at 12:24
  • Did you try using the Haversine distance to compute the distance between two geo-coordinates? – V Sree Harissh Aug 21 '18 at 12:24
  • HSK - unfortunately, O(n) is not desirable because the road network can be 90 km long (for example), with each meter defined by approximately 3 coordinates. CoDhEr - I didn't; the terrain here is flat, not spherical. – Ran Aug 21 '18 at 12:41
  • I don't see how you can go faster than O(n), unless your data is ordered somehow, in which case you can try a binary search approach, which gives you O(log n) complexity. But the order needs to be already established and to be meaningful for your search. – bracco23 Aug 21 '18 at 13:12
  • Would you accept methods that require some preprocessing of the existing points? – Pablo M Aug 21 '18 at 13:29
  • bracco23 and Pablo M - the array can be pre-sorted, so a k-d binary tree could be a good solution. I'll update if it works. Thanks! – Ran Aug 21 '18 at 13:36
  • Yep, a k-d binary tree sounds like the right thing for your problem. It would be great if you posted an answer to let us know how it goes. Good luck – Pablo M Aug 21 '18 at 13:40

2 Answers

  1. If your data is unordered, you won't have any other choice than checking every point in your array.

  2. If your data is sorted in ascending/descending order (by coordinates, for example), you can base your search on that, for example with a binary search as suggested in the comments.

  3. If your data represents a road network and the data structure is similar to the real 2D topology of that network, you could take advantage of that: keep a "pointer" to your current position and explore only the near surroundings of that position to find your point.
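
A minimal sketch of the third idea, under two assumptions that are not stated in the question: the points are stored in the order they appear along the road, and the car moves fewer than `window` points between two time stamps. The function name `closest_near_index`, the `window` size, and the straight-line test road are all hypothetical.

```python
import numpy as np

def closest_near_index(points, query, last_idx, window=50):
    """Search only the points within `window` indices of the last match."""
    lo = max(0, last_idx - window)
    hi = min(len(points), last_idx + window + 1)
    diff = points[lo:hi] - query
    sq_dist = np.einsum("ij,ij->i", diff, diff)  # squared distances, no sqrt
    return lo + int(np.argmin(sq_dist))

# Hypothetical road: 3 points per meter along a straight 1 km line
road = np.column_stack([np.linspace(0.0, 1000.0, 3001), np.zeros(3001)])

idx = 0  # the "pointer": index of the last known closest point
for car in ([1.2, 0.5], [3.9, -0.4], [7.1, 0.2]):
    idx = closest_near_index(road, np.asarray(car), idx)
    print(car, "->", road[idx])
```

Each query then costs O(window) instead of O(n), but the pointer can get lost if the car jumps further than the window or if the road doubles back on itself, so a full search would still be needed as a fallback.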

Edit: One more point: if you need to compare distances to determine the closest one (possibly using Pythagoras to calculate the Euclidean distance), don't compute the square roots if you don't have to. You can compare the squared distances just as well, which saves some operations (and that adds up quickly if you do it very often).
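
As an illustration of the brute-force case combined with the squared-distance trick, here is a small NumPy sketch. The function name `closest_point_bruteforce` and the three sample points are made up for the example.

```python
import numpy as np

def closest_point_bruteforce(points, query):
    """Return (index, distance) of the point closest to `query`, in O(n)."""
    diff = points - query
    sq_dist = np.einsum("ij,ij->i", diff, diff)  # dx*dx + dy*dy per point
    idx = int(np.argmin(sq_dist))                # compare squared distances
    return idx, float(np.sqrt(sq_dist[idx]))     # one sqrt, for the winner only

points = np.array([[0.0, 0.0], [3.0, 4.0], [10.0, 10.0]])
idx, dist = closest_point_bruteforce(points, np.array([3.0, 3.0]))
print(idx, dist)  # 1 1.0
```

The ordering of squared distances is the same as the ordering of the distances themselves, so the argmin is unchanged and the square root is taken only once.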

ChrisN

Thanks to you all. The following is my solution (on synthetic data):

import numpy as np
from sklearn.neighbors import KDTree

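np.random.seed(0)
# Synthetic stand-in for the road network: 1,000,000 random 2D points
samples = np.random.uniform(low=0, high=500, size=(1000000, 2))
# Note: KDTree does not need pre-sorted input. This call sorts along the last
# axis, i.e. within each (x, y) row, so it changes the points themselves.
# It is kept only because the output below was produced with it.
samples.sort(kind='quicksort')
tree = KDTree(samples)  # build once, then reuse for every query
car_location = np.array([[200, 300]])
closest_dist, closest_id = tree.query(car_location, k=1)  # closest point with respect to the car location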
print(samples)
print("The car location is: ", car_location)
print("The closest distance is: ", closest_dist)
print("The closest point is: ", samples[closest_id])

The output:

[[ 274.40675196  357.59468319]
 [ 272.4415915   301.38168804]
 [ 211.82739967  322.94705653]
 ..., 
 [ 276.40594173  372.59594432]
 [ 464.17928848  469.82174863]
 [ 245.93513736  445.84696018]]
The car location is:  [[200 300]]
The closest distance is:  [[ 0.31470178]]
The closest point is:  [[[ 199.72906435  299.8399029 ]]]

ChrisN - the above is a solution for your 2nd suggestion. Your 3rd definitely sounds like a computationally better solution. However, I think my solution will meet the requirements. If it doesn't, I'll try to post a solution for the 3rd. Thanks!
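
For the online case (one query every 0.2 seconds), the tree only has to be built once and can then be queried at each time stamp. The sketch below shows that pattern and checks each result against a brute-force search. Note the swap: it uses `scipy.spatial.cKDTree` rather than sklearn's `KDTree`, and the random road points and car positions are synthetic placeholders. The input is passed unsorted, since the tree does its own partitioning.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
road_points = rng.uniform(0.0, 500.0, size=(100_000, 2))  # synthetic road data
tree = cKDTree(road_points)  # built once, before the online loop

car_positions = rng.uniform(0.0, 500.0, size=(5, 2))  # one per time stamp
results = []
for car in car_positions:
    dist, idx = tree.query(car, k=1)  # nearest neighbour of this position
    results.append((int(idx), float(dist)))

    # Sanity check against an O(n) brute-force search
    sq_dist = ((road_points - car) ** 2).sum(axis=1)
    assert int(idx) == int(np.argmin(sq_dist))
```

Whether this meets the 0.2 second budget depends on the machine and the number of points, so it is worth timing on the real data.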

Ran