0

I have a list with two elements like this:

list_a = [27.666521, 85.437447]

and another list like this:

big_list = [[27.666519, 85.437477], [27.666460, 85.437622], ...]

And I want to find the closest match of list_a within list_b.

For example, here the closest match would be [27.666519, 85.437477].

How would I be able to achieve this?

I found a similar problem here for finding the closest match of a string in an array but was unable to reproduce it similarly for the above mentioned problem.

P.S.The elements in the list are the co-ordinates of points on the earth.

Community
  • 1
  • 1
SUB0DH
  • 5,130
  • 4
  • 29
  • 46

2 Answers2

10

From your question, it's hard to tell how you want to measure the distance, so I simply assume you mean Euclidean distance.

You can use the key parameter to min():

from functools import partial

def distance_squared(x, y):
    return (x[0] - y[0])**2 + (x[1] - y[1])**2

print min(big_list, key=partial(distance_squared, list_a))
Sven Marnach
  • 574,206
  • 118
  • 941
  • 841
  • 1
    Beat me to it by seconds, and yours is more generic. I love the use of partial. Have a well-deserved upvote! – Lauritz V. Thaulow Jul 24 '12 at 11:42
  • Thanks for the answer. I should have mentioned in my question that the distance between the points was the distance between two co-ordinates on earth, but I guess that for a very small difference in distance it wouldn't matter much. – SUB0DH Jul 28 '12 at 02:53
1

Assumptions:

  • You intend to make this type query more than once on the same list of lists
  • Both the query list and the lists in your list of lists represent points in a n-dimensional euclidean space (here: a 2-dimensional space, unlike GPS positions that come from a spherical space).

This reads like a nearest neighbor search. Probably you should take into consideration a library dedicated for this, like scikits.ann.

Example:

import scikits.ann as ann
import numpy as np
k = ann.kdtree(np.array(big_list))
indices, distances = k.knn(list_a, 1)

This uses euclidean distance internally. You should make sure, that the distance measure you apply complies your idea of proximity.

You might also want to have a look on Quadtree, which is another data structure that you could apply to avoid the brute force minimum search through your entire list of lists.

moooeeeep
  • 31,622
  • 22
  • 98
  • 187