Distance map returned from shortest_distance function misses entries of certain vertices

Question

I have a network stored in a PostgreSQL database, where I can route with the pgRouting extension. I've loaded it into memory with graph-tool, and now want to calculate the distance to all nodes within 0.1 hours of a specific starting node:

dm = G.new_vp("double", np.inf)
gt.shortest_distance(G, source=nd[102481678], weights=wgts, dist_map=dm, max_dist=0.1)

where wgts is an EdgePropertyMap containing the weight of each edge, and nd is a reverse mapping from the external node id to the vertex index.

In pgRouting this yields 349 reachable nodes, but with graph-tool only 328. The results otherwise agree (e.g. the furthest node is the same with exactly the same cost, and nodes present in both lists have the same distance), but the graph-tool distance map seems to miss certain nodes. The weird thing is that I found a cul-de-sac node labeled with a distance (second one from below), but the node connecting the cul-de-sac with the outside world is missing. That seems impossible: if the connecting node were not reachable, the cul-de-sac would be unreachable as well.
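The reasoning above can be turned into a mechanical check. The following is only a sketch, not part of graph-tool: the function name `unexplained_nodes`, the tolerance `tol` and the toy data are all made up for illustration. It lists nodes that carry a finite distance although no incoming edge from another labeled node accounts for that distance.

```python
import math

def unexplained_nodes(dist, edges, source, tol=1e-9):
    """Return nodes with a finite distance that no incoming edge accounts for.

    dist:  dict node -> distance (math.inf for unreached nodes)
    edges: iterable of directed (u, v, weight) tuples
    """
    explained = {source}  # the source needs no predecessor
    for u, v, w in edges:
        du = dist.get(u, math.inf)
        dv = dist.get(v, math.inf)
        # v is explained if some labeled neighbour u satisfies dist[u] + w == dist[v]
        if math.isfinite(du) and math.isfinite(dv) and abs(du + w - dv) <= tol:
            explained.add(v)
    return sorted(v for v, d in dist.items()
                  if math.isfinite(d) and v not in explained)

# Hypothetical toy data: 0 -> 1 -> 2, where 2 is a cul-de-sac behind node 1.
edges = [(0, 1, 0.03), (1, 2, 0.02)]
dist = {0: 0.0, 1: math.inf, 2: 0.05}  # node 1 is missing, node 2 is labeled
print(unexplained_nodes(dist, edges, source=0))
```

Run on the real distance map, a non-empty result would point at exactly the kind of inconsistency described here.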

I've compiled a MWE: https://gofile.io/d/YpgjSw

Below is the python code:

import graph_tool.all as gt
import numpy as np

# construct list of [source, target, edge-id] (edge-id not really used in this example)
l = []
with open('nw.txt') as f:
    for row in f:
        fields = row.split('\t')
        edge_id = int(fields[0])
        source = int(fields[1])
        target = int(fields[2])
        # add both directions; they share the same edge id
        l.append([source, target, edge_id])
        l.append([target, source, edge_id])

print(len(l))

# construct graph
G = gt.Graph(directed=True)
G.ep["edge_id"] = G.new_edge_property("int")
n = G.add_edge_list(l, hashed=True, eprops=G.ep["edge_id"])

# construct a dict for mapping outside node-id's to internal id's (node indexes)
nd = {n[v]: int(v) for v in G.vertices()}

# construct a dict for mapping (source, target) combis to a cost and reverse cost
db_wgts = {}
with open('costs.txt') as f:
    for row in f:
        fields = row.split('\t')
        source = int(fields[0])
        target = int(fields[1])
        cost = float(fields[2])
        reverse_cost = float(fields[3])
        db_wgts[(source, target)] = cost
        db_wgts[(target, source)] = reverse_cost

# construct an edge property and fill it according to previous dict
wgts = G.new_edge_property("double")

for i, e in enumerate(G.edges(), start=1):
    print(i)
    print(e)
    s = n[int(e.source())]
    t = n[int(e.target())]
    try:
        wgts[e] = db_wgts[(s, t)]
    except KeyError:
        # no cost known for this (source, target) pair: use a huge placeholder weight
        wgts[e] = 1000000


# calculate shortest distance to all nodes within 0.1 total cost from source-node with outside-id of 102481678
dm = G.new_vp("double", np.inf)
gt.shortest_distance(G, source=nd[102481678], weights=wgts, dist_map=dm, max_dist=0.1)

# some mumbo-jumbo for getting the result in a nice node-id: cost format
ar = dm.get_array()
idxs = np.where(ar < 0.1)[0]
final_res = sorted(zip(idxs, ar[idxs]), key=lambda tup: tup[1])
for idx, dist in final_res:
    print("%s %s" % (n[int(idx)], dist))
# output saved in result_missing_nodes.txt
# 328 records, should be 349

To illustrate (one of the) missing nodes:

>>> dm[nd[63447311]]
0.0696234786274957
>>> dm[nd[106448775]]
0.06165528930577409
>>> dm[nd[127601733]]
inf
>>> dm[nd[100428293]]
0.0819900275163846
>>>

This doesn't seem possible given the local layout of the network (the labels are the ids referenced above):

Could you please provide a minimal and complete (self-contained) program that shows the problem? — Tiago Peixoto, May 22 '20 at 16:08
Sure. Thanks for trying to help me out. I've expanded my question with a MWE. — Mathias Versichele, May 23 '20 at 19:41
Tried it with latest version of graph-tool (docker container): same issue. — Mathias Versichele, May 26 '20 at 12:44

score 0 · Accepted Answer · answered Jun 04 '20 at 10:19

This is a numerical precision problem. You have very low edge weights (1e-6) combined with very large values (1000000), which cause differences to be lost to finite precision. If you replace all values 1000000 (which I assume mean infinite weight) by numpy.inf, you actually get a more stable calculation, and no missing nodes in your example.
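To make the precision argument concrete, here is a minimal sketch in plain Python (no graph-tool involved). It only illustrates the general floating-point effect, not the exact code path inside shortest_distance; the names `big` and `small` are illustrative stand-ins for the placeholder weight and the smallest real weights.

```python
import sys

big = 1000000.0  # the placeholder "infinite" weight
small = 1e-6     # order of magnitude of the smallest real weights

# Adding a tiny weight to a huge one and subtracting the huge one again
# does not give back the tiny weight exactly.
roundtrip = (big + small) - big
abs_err = abs(roundtrip - small)
rel_err = abs_err / small

# Upper bound on the gap between neighbouring doubles near `big`.
spacing_bound = sys.float_info.epsilon * big

print(roundtrip == small)
print(rel_err)
print(spacing_bound)
```

The relative error on the small weight is many orders of magnitude above machine epsilon, so comparisons between path lengths that mix both scales can come out differently than exact arithmetic would suggest.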

An even better alternative is to actually remove the "infinite weight" edges using an edge filter:

u = gt.GraphView(G, efilt=wgts.fa < 1000000)

and compute the distances on that.
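In graph-tool that means passing u instead of G to gt.shortest_distance. The idea itself does not depend on the library, and can be sketched with a small pure-Python Dijkstra that skips placeholder edges up front and stops at a distance bound. This is an illustration under assumptions: the function name `bounded_distances`, the `sentinel` parameter, the toy edge list and the inclusive treatment of the bound are all choices made here, not graph-tool behaviour.

```python
import heapq
import math

def bounded_distances(edges, source, max_dist, sentinel=1000000):
    """Dijkstra from `source`, ignoring sentinel-weight edges, cut off at max_dist."""
    adj = {}
    for u, v, w in edges:
        if w < sentinel:  # same test as the edge filter above
            adj.setdefault(u, []).append((v, w))

    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale queue entry
        for v, w in adj.get(u, []):
            cand = d + w
            if cand <= max_dist and cand < dist.get(v, math.inf):
                dist[v] = cand
                heapq.heappush(heap, (cand, v))
    return dist

# Hypothetical toy network: edge 0 -> 3 carries the placeholder weight.
toy_edges = [(0, 1, 0.03), (1, 2, 0.02), (0, 3, 1000000), (3, 2, 1e-6), (2, 4, 0.2)]
reachable = bounded_distances(toy_edges, source=0, max_dist=0.1)
print(sorted(reachable))
```

Because the placeholder edges never enter the computation, no sum ever mixes the 1000000 scale with the real weights.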

The numpy.inf wasn't enough, but the GraphView did the trick. Thx ! — Mathias Versichele, Jun 08 '20 at 10:49