
I have two lists of locations stored as CSV files (mlloc and cfloc), both in the same format, with the following headers:

index|formatted_address|latitude|longitude|postcode|input_string|number_of_results|status

I wish to measure the distance between each location (as defined by latitude and longitude) in mlloc and each location in cfloc. I have used geopy.distance to measure individual pairs, but I have not managed to use it to build the full pairwise distance matrix.

It would be a bonus if I could attach to each result the formatted_address value of both points in the pair being measured.

I have tried the following, which returns a single distance.

# imports
from geopy import distance as geopy_distance
import pandas as pd

#define inputs
ml_filename = 'C:/Users/hc/Desktop/mlloc.csv'
cf_filename = 'C:/Users/hc/Desktop/cf.csv'

#read data
mldata = pd.read_csv(ml_filename)
cfdata = pd.read_csv(cf_filename)

# pd.read_csv already returns DataFrames, so no conversion is needed

# keep only rows that geocoded successfully (status == 'OK')
mldata = mldata[mldata.status == 'OK']
cfdata = cfdata[cfdata.status == 'OK']

#check
print(cfdata.head())

# loop over every (mlloc, cfloc) pair of rows
results = []
for i in range(len(mldata)):  # start at 0 so the first row is not skipped
    for j in range(len(cfdata)):
        # distance calculation (columns 2 and 3 are assumed to hold latitude and longitude)
        distance = geopy_distance.distance(
            (mldata.iloc[i, 2], mldata.iloc[i, 3]),
            (cfdata.iloc[j, 2], cfdata.iloc[j, 3]),
        )
        answer = [distance, i, j]
        results.append(answer)

print(results)
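
For reference, here is a minimal sketch of the output I am after: one record per (mlloc, cfloc) pair, carrying both formatted_address values and the distance. It uses only the standard library so that it runs on its own. The haversine formula stands in for geopy here (a spherical approximation, so values will differ slightly from geopy's geodesic distance), and the function names, output keys, and sample rows are all hypothetical.

```python
from itertools import product
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees.

    Stand-in for geopy; assumes a spherical Earth with mean radius 6371.0088 km.
    """
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0088 * asin(sqrt(h))


def pairwise_distances(ml_rows, cf_rows, dist=haversine_km):
    """Return one dict per (ml, cf) pair with both addresses and the distance."""
    results = []
    for ml, cf in product(ml_rows, cf_rows):
        km = dist(
            (ml["latitude"], ml["longitude"]),
            (cf["latitude"], cf["longitude"]),
        )
        results.append({
            "ml_address": ml["formatted_address"],
            "cf_address": cf["formatted_address"],
            "distance_km": km,
        })
    return results


# Hypothetical sample rows using the same column names as the CSV headers.
ml_rows = [
    {"formatted_address": "London, UK", "latitude": 51.5074, "longitude": -0.1278},
    {"formatted_address": "Cardiff, UK", "latitude": 51.4816, "longitude": -3.1791},
]
cf_rows = [
    {"formatted_address": "Bristol, UK", "latitude": 51.4545, "longitude": -2.5879},
    {"formatted_address": "Manchester, UK", "latitude": 53.4808, "longitude": -2.2426},
    {"formatted_address": "Leeds, UK", "latitude": 53.8008, "longitude": -1.5491},
]

pairs = pairwise_distances(ml_rows, cf_rows)
for p in pairs:
    print(p)
```

With the real data, rows in this shape can be obtained from the DataFrames via mldata.to_dict('records'), and geopy can be swapped in by passing dist=lambda a, b: geopy_distance.distance(a, b).km.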
