For every vertex in a graph, find all vertices within a distance d

Question

In my particular case, the graph is represented as an adjacency list and is undirected and sparse, n can be in the millions, and d is 3. Calculating A^d (where A is the adjacency matrix) and picking out the non-zero entries works, but I'd like something that doesn't involve matrix multiplication. A breadth-first search on every vertex is also an option, but it is slow.

As the graph is represented as an adjacency list, a depth-first search (up to d=3) should work. You needn't process every vertex, just the vertices that are accessible. – Wei, Apr 16 '11 at 14:14
Could you explain why breadth-first search from each vertex is slow? I would expect it to be the fastest way to find the data you need. – Peer Sommerlund, May 10 '11 at 09:43

score 1 · Answer 1 · answered Oct 11 '13 at 18:05

def find_d(graph, start, st, d=0):
    # Add to st every vertex reachable from start in at most d steps.
    st.add(start)
    if d > 0:
        for neighbor in graph[start]:
            find_d(graph, neighbor, st, d - 1)

    return st

graph = { 1 : [2, 3],
      2 : [1, 4, 5, 6],
      3 : [1, 4],
      4 : [2, 3, 5],
      5 : [2, 4, 6],
      6 : [2, 5]
    }

print(find_d(graph, 1, set(), 2))

Robin Green · Answer 2 · 2011-04-16T09:25:13.847

Let's say that we have a function verticesWithin(d,x) that finds all vertices within distance d of vertex x.

One good strategy for a problem such as this, to expose caching/memoisation opportunities, is to ask the question: How are the subproblems of this problem related to each other?

In this case, we can see that verticesWithin(d,x) if d >= 1 is the union of verticesWithin(d-1,y[i]) for all i within range, where y=verticesWithin(1,x). If d == 0 then it's simply {x}. (I'm assuming that a vertex is deemed to be of distance 0 from itself.)

In practice you'll want to look at the adjacency list for the case d == 1, rather than using that relation, to avoid an infinite loop. You'll also want to avoid the redundancy of considering x itself as a member of y.
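The relation above might be transcribed into Python roughly as follows. This is only a sketch: the name `vertices_within`, the argument order, and the dictionary-of-lists graph are assumptions made for illustration, not part of the answer. The case d == 1 is read straight from the adjacency list, and x is excluded from its own neighbour set.

```python
def vertices_within(graph, d, x):
    """Return the set of vertices within distance d of x (illustrative helper)."""
    if d == 0:
        return {x}                      # a vertex is at distance 0 from itself
    neighbours = set(graph[x]) - {x}    # do not treat x as a member of y
    if d == 1:
        return {x} | neighbours         # base case read from the adjacency list
    result = {x}
    for y in neighbours:
        result |= vertices_within(graph, d - 1, y)
    return result


# Assumed example graph (the six-vertex graph used elsewhere in this thread).
graph = {1: [2, 3], 2: [1, 4, 5, 6], 3: [1, 4],
         4: [2, 3, 5], 5: [2, 4, 6], 6: [2, 5]}
print(sorted(vertices_within(graph, 2, 6)))  # [1, 2, 4, 5, 6]
```

Without caching, this recomputes the same subproblems many times, which is what the rest of the answer sets out to avoid.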

Also, if the return type of verticesWithin(d,x) is changed from a simple list or set, to a list of d sets representing increasing distance from x, then

verticesWithin(d,x) = init(verticesWithin(d+1,x))

where init is the function that yields all elements of a list except the last one. Obviously this would be a non-terminating recursive relation if transcribed literally into code, so you have to be a little bit clever about how you implement it.
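One way to get the layered return type without the non-terminating recursion is to build the layers outwards from x. The sketch below is an assumption about how that could look, not the answer's own code; `layers_within` is a hypothetical name, and it returns d + 1 sets because it includes the distance-0 layer {x}.

```python
def layers_within(graph, d, x):
    """Return [L0, ..., Ld], where Lk holds the vertices at distance exactly k from x."""
    layers = [{x}]
    seen = {x}
    for _ in range(d):
        frontier = set()
        for u in layers[-1]:
            for v in graph[u]:
                if v not in seen:       # first time we reach v, so this is its distance
                    seen.add(v)
                    frontier.add(v)
        layers.append(frontier)
    return layers


# Assumed example graph (the six-vertex graph used elsewhere in this thread).
graph = {1: [2, 3], 2: [1, 4, 5, 6], 3: [1, 4],
         4: [2, 3, 5], 5: [2, 4, 6], 6: [2, 5]}

# The init relation: dropping the last layer of the d + 1 result gives the d result.
print(layers_within(graph, 3, 6)[:-1] == layers_within(graph, 2, 6))  # True
```

With this representation, a result computed for a larger d already contains the answer for every smaller d as a prefix.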

Equipped with these relations between the subproblems, we can now cache the results of verticesWithin, and use these cached results to avoid performing redundant traversals (albeit at the cost of performing some set operations - I'm not entirely sure that this is a win). I'll leave it as an exercise to fill in the implementation details.
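As one possible way to fill in those details, the sketch below caches results with `functools.lru_cache` from the Python standard library. The factory `make_vertices_within` and the use of frozensets are assumptions made for illustration. Whether the cache pays off for n in the millions is unclear, since it stores one set per (d, x) pair.

```python
from functools import lru_cache


def make_vertices_within(graph):
    """Build a memoised vertices_within(d, x) for a fixed graph (illustrative)."""

    @lru_cache(maxsize=None)
    def vertices_within(d, x):
        if d == 0:
            return frozenset([x])
        result = {x}
        for y in graph[x]:
            # Cached: each (d - 1, y) result is shared by every neighbour of y.
            result |= vertices_within(d - 1, y)
        return frozenset(result)

    return vertices_within


# Assumed example graph (the six-vertex graph used elsewhere in this thread).
graph = {1: [2, 3], 2: [1, 4, 5, 6], 3: [1, 4],
         4: [2, 3, 5], 5: [2, 4, 6], 6: [2, 5]}
vertices_within = make_vertices_within(graph)
answers = {x: vertices_within(2, x) for x in graph}
print(sorted(answers[6]))  # [1, 2, 4, 5, 6]
```

Calling `vertices_within.cache_info()` afterwards shows how many subproblem lookups were served from the cache.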

score 0 · Answer 3 · answered Apr 19 '11 at 11:25

You already mention the option of calculating A^d, but this is much, much more than you need (as you already remark).

There is, however, a much cheaper way of using this idea. Suppose you have a (column) vector v of zeros and ones, representing a set of vertices. The vector w := A v then has a positive entry at every node that can be reached from that set in exactly one step. Iterating, u := A w has a positive entry for every node you can reach from the set in exactly two steps, and so on.

For d=3, you could do the following (MATLAB pseudo-code):

n = size(A, 1);
v = zeros(n, 1); v(j) = 1;   % j'th unit vector
w = v;
for i = 1:d
   v = A*v;
   w = w + v;
end

The vector w now has a positive entry for each node that can be reached from the j'th node in at most d steps.
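The same iteration can be sketched in Python directly on an adjacency list, without building A. This is an illustration under assumptions: `reachable_within` is a hypothetical name, vectors are stored as sparse dictionaries from vertex to entry, and the graph is undirected, so A is symmetric and pushing each entry to its neighbours is equivalent to the product A*v.

```python
def reachable_within(graph, j, d):
    """Return the vertices with a positive entry in w = v + A v + ... + A^d v."""
    v = {j: 1}                  # sparse j'th unit vector: vertex -> entry
    w = dict(v)
    for _ in range(d):
        nxt = {}
        for u, entry in v.items():          # sparse matrix-vector product A*v
            for nb in graph[u]:
                nxt[nb] = nxt.get(nb, 0) + entry
        v = nxt
        for u, entry in v.items():          # w = w + v
            w[u] = w.get(u, 0) + entry
    return {u for u, entry in w.items() if entry > 0}


# Assumed example graph (the six-vertex graph used elsewhere in this thread).
graph = {1: [2, 3], 2: [1, 4, 5, 6], 3: [1, 4],
         4: [2, 3, 5], 5: [2, 4, 6], 6: [2, 5]}
print(sorted(reachable_within(graph, 6, 2)))  # [1, 2, 4, 5, 6]
```

The entries of v count walks, so they can grow quickly with d; only their positions matter for the final answer.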

Martijn

I don't see how this is any cheaper (time-wise) than matrix multiplication since this process must be carried out n times. Breadth-first search is a better option for a single starting vertex anyways. Also there is no need for your vector `w`. `A^d * v` (calculated iteratively or otherwise) has the same non-zero elements as `w`. – Robert D Apr 19 '11 at 22:48
You're right, I thought you wanted to start from one vertex. However, please note that there is no matrix-matrix product in this approach, only matrix-vector products (which are much cheaper to compute). – Martijn Apr 20 '11 at 06:16

Josh · Answer 4 · 2011-04-19T12:03:00.297

Breadth-first search starting from the given vertex is an optimal solution in this case. You will find all the vertices that are within distance d, and you will never even visit any vertices at distance >= d + 2.

Here is recursive code, although the recursion can easily be done away with, if so desired, by using a queue.

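import java.util.HashSet;
import java.util.Set;

// Returns the set of nodes within distance d of x.
// adjList(x) is assumed to return the neighbours of x.
Set<Node> getNodesWithinDist(Node x, int d)
{
  Set<Node> s = new HashSet<Node>();  // our return value
  s.add(x);  // x is within distance d of itself for every d >= 0
  if (d > 0) {
    for (Node y : adjList(x)) {
      s.addAll(getNodesWithinDist(y, d - 1));
    }
  }
  return s;
}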
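The queue-based variant mentioned above could look like the following Python sketch. The function name `nodes_within_dist` and the dictionary-of-lists graph are assumptions for illustration. Unlike the recursive version, it records each vertex's distance the first time the vertex is seen, so no vertex is expanded more than once.

```python
from collections import deque


def nodes_within_dist(graph, x, d):
    """Breadth-first search from x that stops expanding at depth d."""
    dist = {x: 0}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        if dist[u] == d:
            continue                    # neighbours of u would be at distance d + 1
        for v in graph[u]:
            if v not in dist:           # first visit gives the shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


# Assumed example graph (the six-vertex graph used elsewhere in this thread).
graph = {1: [2, 3], 2: [1, 4, 5, 6], 3: [1, 4],
         4: [2, 3, 5], 5: [2, 4, 6], 6: [2, 5]}
print(sorted(nodes_within_dist(graph, 6, 2)))  # [1, 2, 4, 5, 6]
```

To answer the question for every vertex, the function would be called once per vertex, for example `{x: nodes_within_dist(graph, x, 3) for x in graph}`.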

edited Apr 19 '11 at 12:03

answered Apr 19 '11 at 11:56

This is the best solution for 1 starting vertex, but the problem is to do this for every vertex. – Robert D Apr 19 '11 at 20:14
