I have a network present in a postgres database, where I can route with the pgrouting extension. I've read this into mem, and now want to calculate the distance of all nodes within 0.1 hours from a specific starting node:
dm = G.new_vp("double", np.inf)
gt.shortest_distance(G, source=nd[102481678], weights=wgts, dist_map = dm, max_dist=0.1)
where wgts is an EdgePropertyMap containing the weights per edge, and nd is a reverse mapping to get vertex index from the outside id.
In pgRouting this delivers 349 reachable nodes, using graph-tool only 328. The results are more or less the same (e.g. the furthest node is the same with the exact same cost, nodes present in both lists have same distance), but the graph-tool distance map just seems to miss certain nodes. The weird thing is that I found a cul-de-sac node labeled with a distance (second one from below), but the node connecting the cul-de-sac with the outside world is missing. Seems weird, because if the connecting node would not be reachable, the cul-de-sac would be unreachable as well.
I've compiled a MWE: https://gofile.io/d/YpgjSw
Below is the python code:
import graph_tool.all as gt
import numpy as np
import time
# construct list of source, target, edge-id (edge-id not really used in this example)
l = []
with open('nw.txt') as f:
rows = f.readlines()
for row in rows:
id = int(row.split('\t')[0])
source = int(row.split('\t')[1])
target = int(row.split('\t')[2])
l.append([source, target, id])
l.append([target, source, id])
print len(l)
# construct graph
G = gt.Graph(directed=True)
G.ep["edge_id"] = G.new_edge_property("int")
n = G.add_edge_list(l, hashed=True, eprops=G.ep["edge_id"])
# construct a dict for mapping outside node-id's to internal id's (node indexes)
nd = {}
i = 0
for x in n:
nd[x] = i
i = i + 1
# construct a dict for mapping (source, target) combis to a cost and reverse cost
db_wgts = {}
with open('costs.txt') as f:
rows = f.readlines()
for row in rows:
source = int(row.split('\t')[0])
target = int(row.split('\t')[1])
cost = float(row.split('\t')[2])
reverse_cost = float(row.split('\t')[3])
db_wgts[(source, target)] = cost
db_wgts[(target, source)] = reverse_cost
# construct an edge property and fill it according to previous dict
wgts = G.new_edge_property("double")
i = 0
for e in G.edges():
i = i + 1
print i
print e
s = n[int(e.source())]
t = n[int(e.target())]
try:
wgts[e] = db_wgts[(s, t)]
except KeyError:
# this was necessary
wgts[e] = 1000000
# calculate shortest distance to all nodes within 0.1 total cost from source-node with outside-id of 102481678
dm = G.new_vp("double", np.inf)
gt.shortest_distance(G, source=nd[102481678], weights=wgts, dist_map = dm, max_dist=0.1)
# some mumbo-jumbo for getting the result in a nice node-id: cost format
ar = dm.get_array()
idxs = np.where(dm.get_array() < 0.1)
vals = ar[ar < 0.1]
final_res = [(i, v) for (i,v) in zip(list(idxs[0]), list(vals))]
final_res.sort(key=lambda tup: tup[1])
for x in final_res:
print n[x[0]], x[1]
# output saved in result_missing_nodes.txt
# 328 records, should be 349
To illustrate (one of the) missing nodes:
>>> dm[nd[63447311]]
0.0696234786274957
>>> dm[nd[106448775]]
0.06165528930577409
>>> dm[nd[127601733]]
inf
>>> dm[nd[100428293]]
0.0819900275163846
>>>
This doesn't seem possible because this is the local layout of the network, labels are the id's referenced above: