I want to get the number of walks between vertices v1 and v2 (paths with multiple visits of the same vertices allowed) in a graph.
There is a very neat algorithm outlined in Mark Newman's book, Networks: An Introduction (see this related math.SE question): the number of walks of length n between v1 and v2 is w(v1,v2)=A**n [v1,v2], i.e. the (v1,v2) element of the n-th power of the adjacency matrix A.
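To make the formula concrete, here is a minimal sketch on a toy graph. The three-vertex path graph 0 - 1 - 2 and the helper name `count_walks` are made up for illustration; they are not from the book or from my real data.

```python
import numpy as np

# Hypothetical toy graph: the path 0 - 1 - 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

def count_walks(A, v1, v2, n):
    """Number of walks of length n between v1 and v2: entry (v1, v2) of A**n."""
    return int(np.linalg.matrix_power(A, n)[v1, v2])

# Length 2 from 0 to 2: the only walk is 0-1-2.
walks_0_2 = count_walks(A, 0, 2, 2)
# Length 3 from 0 to 1: the walks are 0-1-0-1 and 0-1-2-1.
walks_0_1 = count_walks(A, 0, 1, 3)
print(walks_0_2, walks_0_1)
```

This works on a dense array, which is exactly what I cannot afford for the real graph.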
Now my problem is that my graph is enormous and I can only store it as a sparse matrix. For that reason, I cannot compute the power of A.
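One possible workaround, sketched here under assumptions and not tested on a graph of realistic size: only a single entry of the n-th power is needed, and that entry can be obtained from n sparse matrix-vector products without ever forming the power itself. The sketch assumes the adjacency matrix is available as a `scipy.sparse` CSR matrix; the helper name `count_walks_sparse` and the toy graph are hypothetical.

```python
import numpy as np
import scipy.sparse as sp

def count_walks_sparse(A, v1, v2, n):
    """Entry (v1, v2) of A**n, computed with n sparse matrix-vector products."""
    x = np.zeros(A.shape[0], dtype=np.int64)
    x[v2] = 1                  # unit vector e_v2
    for _ in range(n):
        x = A @ x              # after k iterations, x equals A**k applied to e_v2
    return int(x[v1])

# Hypothetical toy graph: the path 0 - 1 - 2, stored as a CSR matrix
A_sparse = sp.csr_matrix(np.array([[0, 1, 0],
                                   [1, 0, 1],
                                   [0, 1, 0]], dtype=np.int64))

print(count_walks_sparse(A_sparse, 0, 1, 3))
```

Only one dense vector of length N is kept in memory, and each step costs roughly one pass over the nonzero entries of the matrix. Walk counts grow quickly with n, so the int64 entries can overflow for long walks or dense graphs; Python integers or floats may be needed in that case.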
I tried to use networkx, because I can create graphs from sparse matrices. But first, I only found how to compute simple paths (walks without revisiting vertices), and second, it is very slow.
import time
import numpy as np
from numpy.linalg import norm
import networkx as nx
# Max Length of 4, paths/walks between vertex 0 and 1.
len_of_paths=4
pos1=0
pos2=1
# Create random adjacency matrix and graph
np.random.seed(6)
rnd_mat=np.random.rand(250,250)
np.fill_diagonal(rnd_mat,0.1)
A = np.floor(rnd_mat+rnd_mat.transpose())
G = nx.from_numpy_array(A)  # from_numpy_matrix was removed in NetworkX 3.0
# Compute all simple paths using networkx
time_start=time.time()
paths_between=list(nx.all_simple_paths(G, source=pos1, target=pos2, cutoff=len_of_paths))
time_end=time.time()
print('time NetworkX: ',(time_end-time_start))
# Compute all walks using matrix multiplications
# Not possible with sparse matrices?
time_start=time.time()
Amult=A
accumulated_paths=Amult[pos1,pos2]
for ii in range(len_of_paths-1):
    Amult=np.dot(Amult,A)
    individual_paths=Amult[pos1,pos2]
    accumulated_paths+=individual_paths
time_end=time.time()
print('time MatMulti: ',(time_end-time_start))
print('Simple paths via networkx: ', len(paths_between))
print('Walks via matrix mult: ', accumulated_paths)
time NetworkX: 10.66906476020813
time MatMulti: 0.003000497817993164
So NetworkX is significantly slower.
Question: Is there any way to speed up the process of estimating the number of simple paths or walks between two vertices in a graph?