import pandas as pd
import numpy as np
from geopy import distance
#Import all the files
shop_loc=pd.read_excel('locations.xlsx')
comp_loc=pd.read_excel('locations_comp.xlsx')
#convert the coordinates of both the files to a tuple list for the geopy to read the distance 
shop_loc['coor']=list(zip(shop_loc.Lat,shop_loc.Lon))
comp_loc['coor']=list(zip(comp_loc.Long,comp_loc.Long))
#Function for the calculation of two points
def distance_from(loc1,loc2): 
    dis = distance.distance(loc1, loc2).miles
    return round(dis,2)
#Calculate the distance of one shop location to competitor shop location
for _,row in comp_loc.iterrows():
    shop_loc[row.Comp]=shop_loc['coor'].apply(lambda x: distance_from(row.coor,x))

I have 62 shops and 8 competitors in two separate files. I am trying to find how far each shop is from each of the 8 competitor shops. When I compute a distance individually for testing, I get the correct value. But as soon as I run this code, I get very different distance values.

For instance, with Shop_location = (40.583639, -75.458446) and Competitor_location = (40.049580, -75.086617), the function I wrote returns more than 7,900 miles, whereas testing the distance manually gives 41.75 miles. Can anyone please help me find where I am going wrong?

  • At the very least, fix the doubled longitude: comp_loc.Long,comp_loc.Long should be Lat/Long. Also, you would be better off using DistanceMetric from sklearn.neighbors, which can calculate the distance matrix for you. Let me know if you need a working example. – Willem Hendriks Aug 27 '20 at 20:06
  • Thank you. I made the Lat and Long naming consistent. If you have a working example I could refer to, that would come in handy. – naman jogani Aug 27 '20 at 22:12
  • https://stackoverflow.com/questions/63142315/how-to-measure-pairwise-distances-between-two-sets-of-points/63143071#63143071 – Willem Hendriks Aug 28 '20 at 06:38
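
To illustrate the point made in the comments above, here is a minimal sketch, using only the standard library, of what happens when the competitor tuple is built from the longitude twice. It uses the haversine (spherical) formula rather than geopy's geodesic, so the figures are approximate. The helper haversine_miles and the Earth radius constant are assumptions made for this illustration and are not part of the original code.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption for this sketch

def haversine_miles(loc1, loc2):
    """Great-circle distance in miles between two (lat, lon) tuples in degrees."""
    lat1, lon1 = map(math.radians, loc1)
    lat2, lon2 = map(math.radians, loc2)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

# The example points from the question
shop = (40.583639, -75.458446)
comp_lat, comp_lon = 40.049580, -75.086617

# Tuple built as (lat, lon), the order the distance function expects
correct = haversine_miles(shop, (comp_lat, comp_lon))
# Tuple built as (lon, lon), which is what zip(Long, Long) produces
buggy = haversine_miles(shop, (comp_lon, comp_lon))

print(round(correct, 2), round(buggy, 2))
```

The first value comes out at roughly 42 miles and the second at close to 8,000 miles, because a longitude of about -75 is read as a latitude near Antarctica. That matches the symptoms described in the question.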

1 Answer


Hey try something like this:

import pandas as pd
import numpy as np
from geopy.distance import geodesic 
#Import all the files
shop_loc=pd.read_excel('locations.xlsx')
comp_loc=pd.read_excel('locations_comp.xlsx')
#convert the coordinates of both the files to a tuple list for the geopy to read the distance 
shop_loc['coor']=list(zip(shop_loc.Lat,shop_loc.Lon))
comp_loc['coor']=list(zip(comp_loc.Lat,comp_loc.Long)) #geopy expects (latitude, longitude)
#Function for the calculation of two points
def distance_from(loc1,loc2): 
    dis = geodesic(loc1, loc2).miles
    return round(dis,2)
#Calculate the distance of one shop location to competitor shop location
for _,row in comp_loc.iterrows():
    shop_loc[row.Comp]=shop_loc['coor'].apply(lambda x: distance_from(row.coor,x))
omn_1
  • I tried to run this solution. However, the calculated distance is still off: it again comes out at over 7,000 miles, while running it manually gives somewhere around 4.5 miles. Let me know if you see any other error. I appreciate your help, though. – naman jogani Aug 27 '20 at 22:11
  • I tried running my solution with two lists, Shop_location = [40.583639, -75.458446] and Competitor_location = [40.049580, -75.086617], and I get the result you are looking for. Unfortunately, I can't run your code, so maybe there is an error somewhere else? You can see my results here: https://imgur.com/YGdtj8h – omn_1 Aug 27 '20 at 22:21
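
Since the two-point test works but the full script does not, one way to narrow things down is to inspect the coordinate tuples themselves before computing any distances. The sketch below is only an illustration: find_suspect_coords is a hypothetical helper, and the "latitude equals longitude" check is a heuristic aimed at the kind of mix-up discussed in this thread, not a general validity test.

```python
def find_suspect_coords(coords):
    """Return (index, reason) pairs for (lat, lon) tuples that look wrong."""
    problems = []
    for i, (lat, lon) in enumerate(coords):
        if not -90 <= lat <= 90:
            problems.append((i, "latitude out of range"))
        elif not -180 <= lon <= 180:
            problems.append((i, "longitude out of range"))
        elif lat == lon:
            # Heuristic: identical values suggest the same column was zipped twice
            problems.append((i, "latitude equals longitude"))
    return problems

good = [(40.049580, -75.086617)]   # (lat, lon)
bad = [(-75.086617, -75.086617)]   # (lon, lon)

print(find_suspect_coords(good))
print(find_suspect_coords(bad))
```

With the DataFrames from the question, the same check could be run on the list in comp_loc['coor'] (for example via comp_loc['coor'].tolist()) before the distance loop.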