Using SciPy's minimize to find the shortest path in a graph

Question

I am trying to find the shortest path in the following graph from G to C and I wrote the following code to accomplish that.

First I give the equations and constraints I believe I should use:

We maximize dc subject to:

db-da <= 8

df-da <= 10

dc-db <= 4

dd-dc <= 3

de-dd <= 25

df-dd <= 18

dd-de <= 9

dg-de <= 7

da-df <= 5

db-df <= 7

dc-df <= 3

de-df <= 2

dd-dg <= 2

dh-dg <= 3

da-dh <= 4

db-dh <= 9

import numpy as np
import scipy as scp
from scipy.optimize import minimize

a,b,c,d,e,f,g,h = 0,1,2,3,4,5,6,7

def objective(x, sign = -1.0):
    return sign*x[c]

def c1(x, sign = -1.0):
    return sign*(x[b]-x[a]-8)
def c2(x, sign = -1.0):
    return sign*(x[f]-x[a]-10)
def c3(x, sign = -1.0):
    return sign*(x[c]-x[b]-4)
def c4(x, sign = -1.0):
    return sign*(x[d]-x[c]-3)
def c5(x, sign = -1.0):
    return sign*(x[e]-x[d]-25)
def c6(x, sign = -1.0):
    return sign*(x[f]-x[d]-18)
def c7(x, sign = -1.0):
    return sign*(x[d]-x[e]-9)
def c8(x, sign = -1.0):
    return sign*(x[g]-x[e]-7)
def c9(x, sign = -1.0):
    return sign*(x[a]-x[f]-5)
def c10(x, sign = -1.0):
    return sign*(x[b]-x[f]-7)
def c11(x, sign = -1.0):
    return sign*(x[c]-x[f]-3)
def c12(x, sign = -1.0):
    return sign*(x[e]-x[f]-2)
def c13(x, sign = -1.0):
    return sign*(x[d]-x[g]-2)
def c14(x, sign = -1.0):
    return sign*(x[h]-x[g]-3)
def c15(x, sign = -1.0):
    return sign*(x[a]-x[h]-4)
def c16(x, sign = -1.0):
    return sign*(x[b]-x[h]-9)

def c17(x, sign = -1.0):
    return x[g]

cs = [c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,c13,c14,c15,c16,c17]

x0 = [0 for i in range(8)]

b = (0,None)
bs = tuple([b for i in range(8)])

cons = []
for i in range(16):
    cons.append({'type': 'ineq', 'fun':cs[i]})
cons.append({'type': 'eq', 'fun':c17})

sol = minimize(objective, x0, method = 'SLSQP', bounds=bs, constraints=cons)
for val in sol['x']:
    print(round(val))

It is possible to merely use algebra to solve for each of the variables but it is supposed to be efficient to use LP to do it instead.

I believe just by a manual trace through the graph that the optimal path is G-H-B-C with a total cost of 16. However, the code above indicates that the optimal path is G-H-A-F-C with costs that make sense until they don't: 3-4-1-3. It says that the optimal path length is 11. It seems so close to a valid answer, except that it thinks you can go from A to F with a cost of 1.

[Edit: I just noticed I missed the edge from B to E but it doesn't seem to matter, and indeed when I add it in the algorithm's answer doesn't change.]

This is not going to work for two reasons: 1. the possible paths in a graph are a discrete set, but `minimize` is designed to work on a continuous domain. 2. `minimize` will find a local optimum, it is not guaranteed to find the globally best solution. You may have more luck with a global optimizer like [`differential_evolution`](https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.optimize.differential_evolution.html). — MB-F, Feb 21 '18 at 07:18
What is this index with the global variable `c` in the objective function suppodes to do? — MB-F, Feb 21 '18 at 07:19
It's a valid approach to compute shortest-path with LPs (without integer-programming). This is based on the concept of total unimodularity. **But** there are assumptions on the solver (basic solution) and ```scipy.minimize``` is not an LP-solver! There is ```scipy.optimize.linprog(method='simplex')``` but it's not very very high-quality although it might work. (don't solve LPs with general NLP-methods like SLSQP; things might change and will do here; another unrelated example: SOCP-solver vs. SOCP with NLP: in NLP an SOC-constraint is non-smooth and can't be tackled without reformulation). — sascha, Feb 21 '18 at 12:41
Nice writing about it - http://yetanothermathprogrammingconsultant.blogspot.co.il/2018/02/modeling-vs-programming-python-vs-gams.html. — Royi, Feb 22 '18 at 11:11

sascha · Accepted Answer · 2018-02-21T14:27:12.803

This (working) code is:

somewhat ugly (no careful analysis of nicely usable graph data-structures for this task)
uses scipy's linprog(method='simplex'), which i do not trust anymore (see issues at github)
follows the LP described at wikipedia
is not for real-world usage
- inefficient data-structures
- inefficient and dense-only solver

Please make sure to read my comment above!

Code

import numpy as np
from scipy.optimize import linprog

""" DATA"""
edges = [('A', 'B', 8),
         ('A', 'F', 10),
         ('B', 'C', 4),
         ('B', 'E', 10),
         ('C', 'D', 3),
         ('D', 'E', 25),
         ('D', 'F', 18),
         ('E', 'D', 9),
         ('E', 'G', 7),
         ('F', 'A', 5),
         ('F', 'B', 7),
         ('F', 'C', 3),
         ('F', 'E', 2),
         ('G', 'D', 2),
         ('G', 'H', 3),
         ('H', 'A', 4),
         ('H', 'B', 9)]
s, t = 'G', 'C'

""" Preprocessing """
nodes = sorted(set([i[0] for i in edges]))  # assumption: each node has an outedge
n_nodes = len(nodes)
n_edges = len(edges)

edge_matrix = np.zeros((len(nodes), len(nodes)), dtype=int)
for edge in edges:
    i, j, v = edge
    i_ind = nodes.index(i)  # slow
    j_ind = nodes.index(j)  # """
    edge_matrix[i_ind, j_ind] = v

nnz_edges = np.nonzero(edge_matrix)
edge_dict = {}
counter = 0
for e in range(n_edges):
    a, b = nnz_edges[0][e], nnz_edges[1][e]
    edge_dict[(a,b)] = counter
    counter += 1

s_ind = nodes.index(s)
t_ind = nodes.index(t)

""" LP """
bounds = [(0, 1) for i in range(n_edges)]
c = [i[2] for i in edges]

A_rows = []
b_rows = []
for source in range(n_nodes):
    out_inds = np.flatnonzero(edge_matrix[source, :])
    in_inds = np.flatnonzero(edge_matrix[:, source])

    rhs = 0
    if source == s_ind:
        rhs = 1
    elif source == t_ind:
        rhs = -1

    n_out = len(out_inds)
    n_in = len(in_inds)

    out_edges = [edge_dict[a, b] for a, b in np.vstack((np.full(n_out, source), out_inds)).T]
    in_edges = [edge_dict[a, b] for a, b in np.vstack((in_inds, np.full(n_in, source))).T]

    A_row = np.zeros(n_edges)
    A_row[out_edges] = 1
    A_row[in_edges] = -1

    A_rows.append(A_row)
    b_rows.append(rhs)

A = np.vstack(A_rows)
b = np.array(b_rows)
res = linprog(c, A_eq=A, b_eq=b, bounds=bounds)
print(res)

Output:

fun: 16.0
message: 'Optimization terminated successfully.'
nit: 11
slack: array([1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 0.])
status: 0
success: True
  x: array([0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1.])

The LP is then looking like:

bounds
[(0, 1), (0, 1), (0, 1), (0, 1), (0, 1), (0, 1), (0, 1), (0, 1), (0, 1), (0, 1), (0, 1), (0, 1), (0, 1), (0, 1), (0, 1), (0, 1), (0, 1)]

objective / c
[8, 10, 4, 10, 3, 25, 18, 9, 7, 5, 7, 3, 2, 2, 3, 4, 9]

constraint-matrix A_eq / A
[[ 1.  1.  0.  0.  0.  0.  0.  0.  0. -1.  0.  0.  0.  0.  0. -1.  0.]
 [-1.  0.  1.  1.  0.  0.  0.  0.  0.  0. -1.  0.  0.  0.  0.  0. -1.]
 [ 0.  0. -1.  0.  1.  0.  0.  0.  0.  0.  0. -1.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0. -1.  1.  1. -1.  0.  0.  0.  0.  0. -1.  0.  0.  0.]
 [ 0.  0.  0. -1.  0. -1.  0.  1.  1.  0.  0.  0. -1.  0.  0.  0.  0.]
 [ 0. -1.  0.  0.  0.  0. -1.  0.  0.  1.  1.  1.  1.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0. -1.  0.  0.  0.  0.  1.  1.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0. -1.  1.  1.]]

rhs / b
[ 0  0 -1  0  0  0  1  0]

which shows that one really should use a solver exploiting sparsity!

Using SciPy's minimize to find the shortest path in a graph

1 Answers1

Code

Output: