
I have two dataframes df1 and df2.

For each point in df1, I want to find the closest point in df2. Once a point from df2 has been matched to a point in df1, it should be removed from df2 (so that the same df2 point is not assigned to more than one point in df1) before moving on to the next point in df1.

df1

        X     Y
0    74.7  30.7
1    74.9  30.7
2    75.1  30.7
3    75.3  30.7
4    75.5  30.7

df2

      X     Y
0   75.80  33.00
1   75.80  33.00
2   76.40  33.00
3   75.80  33.00
4   76.38  33.00
5   76.45  33.00

This answer is pretty close, but to avoid duplicate matches, how can I return the index of the closest point along with the point itself?

My code

import pandas as pd
from scipy.spatial.distance import cdist

def closest_point(point, points):
    """Return the position and value of the point in `points` closest to `point`."""
    idx = cdist([point], points).argmin()  # compute the distances only once
    return idx, points[idx]

dff1 = pd.DataFrame()
dff2 = pd.DataFrame()
# Note: the tuples are stored in (Y, X) order
dff1['point'] = [(y, x) for y, x in zip(df1['Y'], df1['X'])]
dff2['point'] = [(y, x) for y, x in zip(df2['Y'], df2['X'])]

P = []
for x in dff1['point']:
    idx, p = closest_point(x, list(dff2['point']))
    P.append(p)
    dff2.drop(idx, inplace=True)

But somehow this is not working.

How can I solve this? Please suggest some solutions.

vgb_backup
  • Are you finding the closest point in df2 for each point in df1? If so, what would you do when a point in df2 is the closest for multiple points in df1? – Kota Mori Dec 09 '21 at 08:46
  • What I had in mind was that once the closest point from df2 is assigned to a point in df1, removing it from df2 resolves this confusion, so the other points would find some alternate closest point. – vgb_backup Dec 09 '21 at 08:57
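
The approach described in these comments (match each df1 point to its nearest df2 point, then retire that df2 point) can be sketched roughly as follows. One likely reason the loop in the question fails is that `cdist(...).argmin()` returns a position in the shrinking list, while `drop(idx)` expects an index label; after the first removal the two no longer line up. The sketch below avoids this by keeping each candidate's original index. The names `greedy_nearest`, `df1_points` and `df2_points` are illustrative only, and plain (X, Y) tuples stand in for the DataFrame rows.

```python
import math

def greedy_nearest(sources, candidates):
    """For each source point, pick the nearest candidate not yet used.

    Returns a list of (candidate_index, candidate_point) pairs in source
    order. Indices refer to positions in the original `candidates`
    sequence, so they stay valid even as candidates are removed.
    """
    remaining = dict(enumerate(candidates))  # original index -> point
    matches = []
    for src in sources:
        if not remaining:
            break  # more sources than candidates: stop early
        # Nearest remaining candidate; ties go to the lowest original index
        idx, pt = min(remaining.items(),
                      key=lambda item: math.dist(src, item[1]))
        matches.append((idx, pt))
        del remaining[idx]  # this candidate cannot be matched again
    return matches

# Data from the question, as (X, Y) tuples
df1_points = [(74.7, 30.7), (74.9, 30.7), (75.1, 30.7),
              (75.3, 30.7), (75.5, 30.7)]
df2_points = [(75.80, 33.00), (75.80, 33.00), (76.40, 33.00),
              (75.80, 33.00), (76.38, 33.00), (76.45, 33.00)]

matches = greedy_nearest(df1_points, df2_points)
for src, (idx, pt) in zip(df1_points, matches):
    print(src, "->", idx, pt)
```

With DataFrames, the inputs could be built with `list(zip(df1['X'], df1['Y']))` and `list(zip(df2['X'], df2['Y']))`, and the matched rows looked up with `df2.iloc[idx]`. Note that this is a greedy match in df1 order, so the result depends on the order of the rows; if a globally minimal total distance is wanted instead, `scipy.optimize.linear_sum_assignment` applied to the `cdist` matrix may be worth a look.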
