I was trying to code up a brute force algorithm in Python from scratch that solves the Shortest Hamiltonian Path problem for a weighted complete graph as follows:

def min_route(cities, distances):
    """Finds the Shortest Hamiltonian Path for a weighted complete graph.

    Args:
        cities (list of string):
            The vertices of the graph.

        distances (dict):
            The distances between any two cities. Maps each origin city to a
            dictionary that maps destination cities to distances, sort of like
            an adjacency matrix. Type: Dict<string, Dict<string, int>>.

    Returns:
        (list of string, int):
            The list of cities in the optimal route and its length.
    """
    if len(cities) < 2:
        return cities, 0

    best_route, min_dist = None, float('inf')
    for i in range(len(cities)):
        first, rest = cities[i], cities[:i] + cities[i+1:]
        sub_route, sub_dist = min_route(rest, distances)
        route = [first] + sub_route
        dist = sub_dist + distances[first][sub_route[0]]
        if dist < min_dist:
            best_route, min_dist = route, dist

    return best_route, min_dist

It turns out that this algorithm doesn't work and that it's sensitive to the order of the initial list of cities. This confused me, as I thought that it would enumerate all n! possible city permutations, where n is the number of cities. It seems that I'm pruning some of the routes too early; instead, I should do something like:

def min_route_length(cities, distances):
    routes = get_a_list_of_all_permutations_of(cities)
    return min(compute_route_length(route, distances) for route in routes)

Question: What is a simple counterexample that demonstrates why my algorithm is suboptimal?

Follow Up: Is my suboptimal algorithm at least some kind of approximation algorithm that uses some kind of greedy heuristic? Or is it really just a terrible O(n!) algorithm?

Adriano

1 Answer

Assuming your graph is directed (so the weight from A to B can differ from the weight from B to A), one counterexample is the following distance matrix, where the row is the origin and the column is the destination:

   A  B  C
A  x  1  5
B 30  x 10
C 30  9  x

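Every path that does not start at A costs at least 30, so we don't need to consider those. For paths starting at A, your code makes a recursive call with [B, C]. Taken on their own, the best arrangement of those two cities is C>B with cost 9, and that is what the recursive call returns. However, the full path A>C>B costs 5 + 9 = 14, whereas the optimal path A>B>C costs 1 + 10 = 11. The recursive call picks its sub-route without knowing which city precedes it, so the edge from A into the sub-route plays no part in that choice.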
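To check this by hand, here is a small sketch that runs the recursion from the question on the matrix above and compares it with an explicit enumeration of all permutations. The helpers `route_length` and `brute_force` are names I made up for the comparison, and the diagonal "x" entries are simply left out of the dictionary since they are never looked up.

```python
from itertools import permutations

def min_route(cities, distances):
    # The recursion from the question, reproduced unchanged.
    if len(cities) < 2:
        return cities, 0
    best_route, min_dist = None, float('inf')
    for i in range(len(cities)):
        first, rest = cities[i], cities[:i] + cities[i+1:]
        sub_route, sub_dist = min_route(rest, distances)
        route = [first] + sub_route
        dist = sub_dist + distances[first][sub_route[0]]
        if dist < min_dist:
            best_route, min_dist = route, dist
    return best_route, min_dist

def route_length(route, distances):
    # Sum of the edge weights between consecutive cities (hypothetical helper).
    return sum(distances[a][b] for a, b in zip(route, route[1:]))

def brute_force(cities, distances):
    # Reference answer: try every permutation explicitly (hypothetical helper).
    best = min(permutations(cities), key=lambda r: route_length(r, distances))
    return list(best), route_length(best, distances)

# The counterexample matrix: distances[origin][destination].
distances = {
    'A': {'B': 1, 'C': 5},
    'B': {'A': 30, 'C': 10},
    'C': {'A': 30, 'B': 9},
}
cities = ['A', 'B', 'C']

print(min_route(cities, distances))    # (['A', 'C', 'B'], 14)
print(brute_force(cities, distances))  # (['A', 'B', 'C'], 11)
```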

You're correct that it is O(n!). To fix it, you just need to pass one extra argument down the recursion: the starting point, i.e. the city that precedes the sub-route (possibly None for the first call), and take it into account when calculating dist.
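One way that fix might look is sketched below. The parameter name `prev` is my own choice, and I changed the base case to an empty list so that the edge into the last city is also counted; treat it as an illustration rather than the only way to do it.

```python
def min_route(cities, distances, prev=None):
    """Shortest Hamiltonian path over `cities`, optionally entered from `prev`.

    `prev` is an assumed extra parameter: the city visited just before this
    sub-route, or None on the top-level call.
    """
    if not cities:
        return [], 0

    best_route, min_dist = None, float('inf')
    for i in range(len(cities)):
        first, rest = cities[i], cities[:i] + cities[i+1:]
        # Cost of the edge leading into `first` (nothing on the top-level call).
        step = 0 if prev is None else distances[prev][first]
        # The sub-route is now chosen knowing that `first` precedes it.
        sub_route, sub_dist = min_route(rest, distances, first)
        dist = step + sub_dist
        if dist < min_dist:
            best_route, min_dist = [first] + sub_route, dist

    return best_route, min_dist
```

With the matrix above stored as a nested dict, `min_route(['A', 'B', 'C'], distances)` returns `(['A', 'B', 'C'], 11)`, and the result no longer depends on the order of the input list. The running time is still O(n!), since each call with n cities makes n recursive calls on n - 1 cities.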