I have two dataframes, df1 and df2. I want to find the closest point in df2 for each point in df1. Once a point from df2 has been selected for a point in df1, it should be removed from df2 (so the same df2 point is never assigned to more than one point in df1) before moving on to the next point in df1.
df1

          X     Y
    0  74.7  30.7
    1  74.9  30.7
    2  75.1  30.7
    3  75.3  30.7
    4  75.5  30.7

df2

           X      Y
    0  75.80  33.00
    1  75.80  33.00
    2  76.40  33.00
    3  75.80  33.00
    4  76.38  33.00
    5  76.45  33.00
This answer is pretty close, but to avoid duplicate values, how can I return the index of the closest point along with the point itself?

My code:
    import pandas as pd
    from scipy.spatial.distance import cdist

    def closest_point(point, points):
        """ Find closest point from a list of points. """
        # position of the nearest candidate within the list `points`
        idx = cdist([point], points).argmin()
        return idx, points[idx]

    dff1 = pd.DataFrame()
    dff2 = pd.DataFrame()
    # points are stored as (Y, X) tuples
    dff1['point'] = [(x, y) for x, y in zip(df1['Y'], df1['X'])]
    dff2['point'] = [(x, y) for x, y in zip(df2['Y'], df2['X'])]

    P = []
    for x in dff1['point']:
        idx, p = closest_point(x, list(dff2['point']))
        P.append(p)
        dff2.drop(idx, inplace=True)
But somehow this is not working.

How can I solve this? Please suggest some solutions.
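
A likely cause, for what it's worth: `closest_point` returns a position within the temporary list, while `DataFrame.drop` expects an index label. After the first row is dropped the two no longer line up, so the second `drop(idx)` can raise a `KeyError` or remove the wrong row. Below is a minimal sketch of one possible fix that maps the position back to the original df2 label before dropping. The function name `greedy_closest` and the output column `df2_index` are made up for illustration, and plain NumPy is used for the Euclidean distance instead of `cdist` (either should work here).

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"X": [74.7, 74.9, 75.1, 75.3, 75.5],
                    "Y": [30.7, 30.7, 30.7, 30.7, 30.7]})
df2 = pd.DataFrame({"X": [75.80, 75.80, 76.40, 75.80, 76.38, 76.45],
                    "Y": [33.00, 33.00, 33.00, 33.00, 33.00, 33.00]})


def greedy_closest(df1, df2):
    """Match each row of df1 to its nearest still-unused row of df2.

    Hypothetical helper: returns the chosen df2 index label together
    with the point's coordinates, one row per matched df1 point.
    """
    remaining = df2[["X", "Y"]].copy()  # candidates not taken yet
    matches = []
    for point in df1[["X", "Y"]].to_numpy():
        if remaining.empty:  # df2 ran out of points
            break
        # Euclidean distance from this point to every remaining candidate
        dists = np.linalg.norm(remaining.to_numpy() - point, axis=1)
        pos = int(dists.argmin())      # position within `remaining`
        label = remaining.index[pos]   # original df2 index label
        matches.append((label, *remaining.iloc[pos]))
        remaining = remaining.drop(label)  # drop by label, not by position
    return pd.DataFrame(matches, columns=["df2_index", "X", "Y"],
                        index=df1.index[:len(matches)])


result = greedy_closest(df1, df2)
print(result)
```

With the sample data above, this should select df2 rows 0, 1, 3, 4 and 2, in that order, so no df2 point is used twice. Note that this is a greedy match: the result depends on the order of the rows in df1, and ties go to the first candidate found.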