Background
I'm performing an iterative traffic assignment (ITA) on a directed weighted graph with ~12k nodes and ~25k edges. In each of the four ITA iterations, I have to find the shortest paths from each origin to a set of destinations (the same nodes that serve as origins). The pseudocode looks like this:
for iteration in iterations:
    for origin in origins:
        paths = find the shortest paths between origin and destinations
        for destination in destinations:
            for each edge between origin and destination:
                assign traffic to edge
                compute some quantities based on path properties
There are ~30 nodes that are origins/destinations. The code I'm using is currently in Python 2.7 and uses networkx 1.8.1 to find the shortest paths between an origin and all destinations, specifically the function networkx.single_source_dijkstra_path.
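The two inner loops of the pseudocode above can be sketched as follows. This is a minimal illustration, assuming each path is a list of node IDs keyed by destination (the shape that single_source_dijkstra_path returns). The names assign_traffic, demand, fraction, and flow are hypothetical and not taken from the actual implementation, and the "compute some quantities" step is left out.

```python
from collections import defaultdict


def assign_traffic(paths, demand, fraction, flow):
    """Add fraction * demand[destination] to every edge on each path.

    paths:    {destination: [origin, ..., destination]} node lists
    demand:   {destination: trips from the origin} (hypothetical layout)
    fraction: share of demand assigned in this iteration, e.g. 0.4
    flow:     {(u, v): accumulated traffic}, updated in place
    """
    for destination, path in paths.items():
        trips = fraction * demand.get(destination, 0.0)
        # Consecutive node pairs along the path are the edges it uses.
        for u, v in zip(path, path[1:]):
            flow[(u, v)] += trips
    return flow


# Toy data for one origin "a" (hypothetical).
flow = defaultdict(float)
paths = {"b": ["a", "b"], "c": ["a", "b", "c"]}
demand = {"b": 10.0, "c": 20.0}
assign_traffic(paths, demand, 0.5, flow)
```

Edge ("a", "b") lies on both paths, so it accumulates traffic from both destinations, while ("b", "c") only receives the share bound for "c".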
Question
One call of ITA takes ~6.2 seconds on my local machine; about 95% of that time is spent just finding these shortest paths. Since graph-tool's own documentation reports it to be 100x faster than networkx at finding shortest paths, I tried implementing the same code using graph-tool functions. Of note: the performance figures in graph-tool's documentation are based on a different machine than the one I am using (a MacBook Pro).
I've profiled the performance of networkx (version 1.8 in Python 2.7) and graph-tool (version 2.35 in Python 3.6), considering two metrics: (a) the time to complete one call of ITA and (b) the average time to find the set of paths between one origin and its destinations using the shortest-path functions in each package.
networkx
- (a) 6.2 seconds (b) 0.036 seconds
- using paths_dict = networkx.single_source_dijkstra_path(G, origin, cutoff=None, weight='t_a')
graph-tool
- (a) 6.8 seconds (b) 0.050 seconds
- using paths_dict = {destination: topology.shortest_path(G, origin, destination, weights=G.edge_properties.ta) for destination in od_dict[origin]}, where topology is the graph_tool.topology module
Why is graph-tool slower than networkx in my code? Is there a faster way to implement a single-origin-to-multiple-destinations shortest path search in graph-tool?
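One difference between the two snippets above is that the networkx call runs a single search per origin, while the graph-tool dict comprehension calls shortest_path once per destination, so roughly 30 searches per origin. The sketch below uses only the standard library and is not either package's implementation. It illustrates the single-source idea: run Dijkstra once, keep a predecessor map, and read each destination's path off that map. The function names and the toy graph are made up for illustration. In graph-tool, the analogous building block appears to be shortest_distance with pred_map=True; whether that is faster on this graph is untested here.

```python
import heapq


def dijkstra_pred(adj, source):
    """Single-source Dijkstra returning (dist, pred) dictionaries.

    adj maps node -> list of (neighbor, weight) with non-negative weights.
    """
    dist = {source: 0.0}
    pred = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                pred[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, pred


def path_from_pred(pred, source, target):
    """Walk the predecessor map back from target; [] if unreachable."""
    if target != source and target not in pred:
        return []
    path = [target]
    while path[-1] != source:
        path.append(pred[path[-1]])
    path.reverse()
    return path


# Toy directed graph (hypothetical data).
adj = {
    "o": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 2.0), ("c", 6.0)],
    "b": [("c", 1.0)],
}
dist, pred = dijkstra_pred(adj, "o")  # one search for the origin
paths_dict = {t: path_from_pred(pred, "o", t) for t in ["b", "c"]}
```

The search cost is paid once per origin, and each additional destination only costs a walk back along its own path.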
Full code
Here's the relevant portion of the ITA algorithm using graph-tool.
import time
import graph_tool as gt
from graph_tool import topology
# bd is the demand-building module; its import is not shown in this excerpt

def test_traffic_assignment_graph_tool():
    # Assign OD vals in this amount per iteration. These are recommended values from the Nature paper, http://www.nature.com/srep/2012/121220/srep01001/pdf/srep01001.pdf
    iteration_vals = [0.4, 0.3, 0.2, 0.1]
    G = gt.load_graph("input/graphMTC_GB.gml")
    original_node_ids = [G.vertex_properties.label[temp] for temp in G.vertices()]  # these are the original node IDs (match the networkx graph)
    new_node_ids = [G.vertex_index[v] for v in G.vertices()]  # these are the new node IDs assigned by graphtool
    # Create a mapping from original to new node ids -- since G.get_vertices() always returns the same order, it's ok.
    original_to_new = dict(zip(original_node_ids, new_node_ids))
    new_to_original = dict(zip(new_node_ids, original_node_ids))
    demand = bd.build_demand('input/BATS2000_34SuperD_TripTableData.csv',
                             'input/superdistricts_centroids_dummies.csv')
    overall_start = time.time()
    paths_time = []
    # sort OD pairs to fix inconsistency across different runs of the traffic assignment
    origins = [int(i) for i in demand.keys()]  # get SD node IDs as integers
    origins.sort()  # sort them
    origins = [str(i) for i in origins]  # make them strings again
    od_dict = bd.build_od(demand)
    for i in range(len(iteration_vals)):  # do 4 iterations
        for origin in origins:
            paths_start = time.time()
            # one shortest_path call per destination of this origin
            paths_dict = {destination: topology.shortest_path(G, original_to_new[origin], original_to_new[destination], weights=G.edge_properties.ta) for destination in od_dict[origin]}
            paths_time.append(time.time() - paths_start)
    overall_end = time.time()
    print('Graphtool total pathfinding time = ', sum(paths_time))
    print('Graphtool average pathfinding time = ', sum(paths_time) / len(paths_time))