I have code that takes n inputs and computes the shortest route through them without ever revisiting a point. I think this is the same as the Hamiltonian path problem.

My code takes n addresses as inputs and iterates over all possible orderings without repeats. Right now I use the 'brute force' method, where each loop grabs the start/end location, calculates the distance, excludes repeated locations, and then adds only the paths that visit every point to my DataFrame. Since there are 5 locations, the 5th nested for loop block writes the sequence and distance to the DataFrame.

DF with values:

Index   Type    start_point
0       Start   (38.9028613352942, -121.339977998194)
1       A       (38.8882610961556, -121.297759)
2       B       (38.9017768701178, -121.328815149117)
3       C       (38.902337877551, -121.273244306122)
4       D       (38.8627754142291, -121.313577618114)
5       E       (38.882338375, -121.277366625)

My code goes like:

from geopy.distance import vincenty
import pandas as pd
master=pd.DataFrame()
master['locations']=''
master['distance']=''
n=0

df1a=source[source.Type != source.loc[0,'Type']]
df1a=df1a.reset_index(drop=True)
for i1a in df1a.index:
    i1_master=vincenty(source.loc[0,'start_point'],df1a.loc[i1a,'start_point']).miles

for i2 in df1a.index:
    df2a=df1a[df1a.Type != df1a.loc[i2,'Type']]
    df2a=df2a.reset_index(drop=True)            
    for i2a in df2a.index:
        if df1a.loc[i1a,'Type']==df2a.loc[i2a,'Type']:
            break
        else:
            i2_master=i1_master+vincenty(df1a.loc[i1a,'start_point'],df2a.loc[i2a,'start_point']).miles

        for i3 in df2a.index:
            df3a=df2a[df2a.Type != df2a.loc[i3,'Type']]
            df3a=df3a.reset_index(drop=True)
            for i3a in df3a.index:
                if df1a.loc[i1a,'Type']==df3a.loc[i3a,'Type']:
                    break
                if df2a.loc[i2a,'Type']==df3a.loc[i3a,'Type']:
                    break
                else:
                    i3_master=i2_master+vincenty(df2a.loc[i2a,'start_point'],df3a.loc[i3a,'start_point']).miles

                for i4 in df3a.index:
                    df4a=df3a[df3a.Type != df3a.loc[i4,'Type']]
                    df4a=df4a.reset_index(drop=True)                            
                    for i4a in df4a.index:
                        if df1a.loc[i1a,'Type']==df4a.loc[i4a,'Type']:
                            break
                        if df2a.loc[i2a,'Type']==df4a.loc[i4a,'Type']:
                            break
                        if df3a.loc[i3a,'Type']==df4a.loc[i4a,'Type']:
                            break
                        else:
                            i4_master=i3_master+vincenty(df3a.loc[i3a,'start_point'],df4a.loc[i4a,'start_point']).miles

                            for i5 in df4a.index:
                                df5a=df4a[df4a.Type != df4a.loc[i5,'Type']]
                                df5a=df5a.reset_index(drop=True)
                            for i5a in df5a.index:
                                if df1a.loc[i1a,'Type']==df5a.loc[i5a,'Type']:
                                    break
                                if df2a.loc[i2a,'Type']==df5a.loc[i5a,'Type']:
                                    break
                                if df3a.loc[i3a,'Type']==df5a.loc[i5a,'Type']:
                                    break
                                if df4a.loc[i4a,'Type']==df5a.loc[i5a,'Type']:
                                    break
                                if df4a.loc[i4a,'Type']==df5a.loc[i5a,'Type']:
                                    break
                                else:
                                    i5_master=i4_master+vincenty(df4a.loc[i4a,'start_point'],df5a.loc[i5a,'start_point']).miles

                                #This loop is special, it calculates distance back to the start.
                                    for i5 in df4a.index:
                                        df5a=df4a[df4a.Type != df4a.loc[i5,'Type']]
                                        df5a=df5a.reset_index(drop=True)
                                    for i5a in df5a.index:
                                        master.loc[n,'locations']=source.loc[0,'Type']+'_'+df1a.loc[i1a,'Type']+'_'+df2a.loc[i2a,'Type']+'_'+df3a.loc[i3a,'Type']+'_'+df4a.loc[i4a,'Type']+'_'+df5a.loc[i5a,'Type']+'_'+source.loc[0,'Type']
                                        master.loc[n,'distance']=i5_master+vincenty(df5a.loc[i5a,'start_point'],df1a.loc[0,'start_point']).miles
                                        n=n+1

Is there a way to use recursive code to build this structure? As a Chemical Engineer I am out of my league ;)

For example, the number of if statements (which check for repeated start_points) grows in each section, and their arguments change.

Any other pointers are appreciated.

Merlin
This is a special case of the Travelling Salesman Problem, which is perhaps the most famous example of an intractable problem: one which cannot be solved in sensible time for any sizable input. –  Jul 02 '16 at 21:26

1 Answer

This is a special case of the Travelling Salesman Problem, which is perhaps the most famous example of an intractable problem: no known algorithm solves it exactly in sensible time for every sizable input. Recursion does not change that. A brute-force search, recursive or not, still examines all N! orderings, so it takes O(N!) time (and O(N!) memory if every route is stored, as in your code), which is only viable, even on modern systems, for small numbers of inputs (fewer than about 10).
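For a handful of stops, the nested loops can be collapsed into one recursive generator. The following is only a minimal sketch under some assumptions: the locations are held in a plain dict rather than your DataFrame, `haversine_miles` and `all_round_trips` are names made up here, and a great-circle (haversine) distance stands in for `vincenty` so the snippet runs without geopy (recent geopy releases offer `geodesic` instead of `vincenty`).

```python
from math import radians, sin, cos, asin, sqrt

def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees.

    Stand-in for geopy's vincenty; slightly less accurate, but dependency-free.
    """
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(h))  # 3958.8 = mean Earth radius in miles

def all_round_trips(points, start, dist=haversine_miles):
    """Yield (route, miles) for every round trip from `start` through all other points."""
    def extend(route, remaining, miles):
        here = points[route[-1]]
        if not remaining:
            # Every stop has been visited: close the loop back to the start.
            yield route + [start], miles + dist(here, points[start])
            return
        for name in sorted(remaining):
            # Recurse with this stop removed, so it can never be visited twice.
            yield from extend(route + [name], remaining - {name},
                              miles + dist(here, points[name]))
    yield from extend([start], frozenset(points) - {start}, 0.0)

points = {
    "Start": (38.9028613352942, -121.339977998194),
    "A": (38.8882610961556, -121.297759),
    "B": (38.9017768701178, -121.328815149117),
    "C": (38.902337877551, -121.273244306122),
    "D": (38.8627754142291, -121.313577618114),
    "E": (38.882338375, -121.277366625),
}

routes = list(all_round_trips(points, "Start"))
best_route, best_miles = min(routes, key=lambda r: r[1])
print(len(routes))  # 120, i.e. 5! orderings of the five stops
print("_".join(best_route), round(best_miles, 2))
```

Passing the set of unvisited stops down the recursion replaces the growing chain of if statements: a stop that has been used is simply no longer in `remaining`. The cost is still factorial, so this only tidies the code; it does not make larger inputs feasible.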

If you are willing to sacrifice perfect solutions for the sake of resources, check out some sub-optimal heuristic solutions here: http://www.math.tamu.edu/~mpilant/math167/Notes/Chapter2.pdf
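One widely used heuristic of this kind is nearest neighbour: from the current stop, always travel to the closest stop not yet visited, then return to the start. Below is a rough sketch, again assuming a dict of locations and a haversine distance in place of `vincenty`; the function names are invented for illustration. It runs in O(N^2) time but carries no guarantee of finding the shortest route.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(h))  # 3958.8 = mean Earth radius in miles

def nearest_neighbour_route(points, start, dist=haversine_miles):
    """Greedy round trip: repeatedly move to the closest unvisited point."""
    route, miles = [start], 0.0
    unvisited = set(points) - {start}
    while unvisited:
        here = points[route[-1]]
        # sorted() only makes tie-breaking deterministic.
        nxt = min(sorted(unvisited), key=lambda name: dist(here, points[name]))
        miles += dist(here, points[nxt])
        route.append(nxt)
        unvisited.remove(nxt)
    # Close the loop back to the starting point.
    miles += dist(points[route[-1]], points[start])
    route.append(start)
    return route, miles

points = {
    "Start": (38.9028613352942, -121.339977998194),
    "A": (38.8882610961556, -121.297759),
    "B": (38.9017768701178, -121.328815149117),
    "C": (38.902337877551, -121.273244306122),
    "D": (38.8627754142291, -121.313577618114),
    "E": (38.882338375, -121.277366625),
}

route, miles = nearest_neighbour_route(points, "Start")
print("_".join(route), round(miles, 2))
```

For five stops it is easy to compare this result against the brute-force minimum; for larger inputs the greedy route is usually a reasonable starting point that other heuristics can then improve.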