In my particular case, the graph is represented as an adjacency list and is undirected and sparse, n can be in the millions, and d is 3. Calculating A^d (where A is the adjacency matrix) and picking out the non-zero entries works, but I'd like something that doesn't involve matrix multiplication. A breadth-first search on every vertex is also an option, but it is slow.
-
As the graph is represented as an adjacency list, a depth-first search (up to d=3) should work. You needn't work on every vertex, only the accessible ones. – Wei Apr 16 '11 at 14:14
-
Could you explain why breadth-first search from each vertex is slow? I would expect it to be the fastest way to find the data you need. – Peer Sommerlund May 10 '11 at 09:43
4 Answers
def find_d(graph, start, st, d=0):
    # Collect into the set `st` every vertex reachable from `start` in at most d steps.
    if d == 0:
        st.add(start)
    else:
        st.add(start)
        for edge in graph[start]:
            find_d(graph, edge, st, d-1)
    return st

graph = { 1 : [2, 3],
          2 : [1, 4, 5, 6],
          3 : [1, 4],
          4 : [2, 3, 5],
          5 : [2, 4, 6],
          6 : [2, 5]
        }
print(find_d(graph, 1, set(), 2))

Let's say that we have a function `verticesWithin(d,x)` that finds all vertices within distance `d` of vertex `x`.
One good strategy for a problem such as this, to expose caching/memoisation opportunities, is to ask the question: How are the subproblems of this problem related to each other?
In this case, we can see that `verticesWithin(d,x)`, if `d >= 1`, is the union of `verticesWithin(d-1,y[i])` for all `i` within range, where `y=verticesWithin(1,x)`. If `d == 0` then it's simply `{x}`. (I'm assuming that a vertex is deemed to be of distance 0 from itself.)
In practice you'll want to look at the adjacency list for the case `d == 1`, rather than using that relation, to avoid an infinite loop. You'll also want to avoid the redundancy of considering `x` itself as a member of `y`.
Also, if the return type of `verticesWithin(d,x)` is changed from a simple list or set to a list of `d` sets representing increasing distance from `x`, then
`verticesWithin(d,x) = init(verticesWithin(d+1,x))`
where `init` is the function that yields all elements of a list except the last one. Obviously this would be a non-terminating recursive relation if transcribed literally into code, so you have to be a little bit clever about how you implement it.
Equipped with these relations between the subproblems, we can now cache the results of `verticesWithin`, and use these cached results to avoid performing redundant traversals (albeit at the cost of performing some set operations - I'm not entirely sure that this is a win). I'll leave it as an exercise to fill in the implementation details.
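One way the cached relation could be filled in is sketched below in Python. This is only an illustration, not the answerer's implementation: the names `GRAPH` and `vertices_within` are made up here, the small graph is the example from the first answer, and the cache is the standard library's `functools.lru_cache`. The neighbours are read straight from the adjacency list, which sidesteps the infinite loop mentioned above.

```python
from functools import lru_cache

# Illustrative undirected graph as an adjacency list (assumed example data).
GRAPH = {
    1: [2, 3],
    2: [1, 4, 5, 6],
    3: [1, 4],
    4: [2, 3, 5],
    5: [2, 4, 6],
    6: [2, 5],
}

@lru_cache(maxsize=None)
def vertices_within(d, x):
    """Return a frozenset of all vertices at distance <= d from x."""
    if d == 0:
        return frozenset([x])
    result = {x}
    for y in GRAPH[x]:
        # Each neighbour's (d - 1) answer is computed once, then served from the cache.
        result |= vertices_within(d - 1, y)
    return frozenset(result)

# Answer the question for every vertex with d = 3.
all_within_3 = {x: vertices_within(3, x) for x in GRAPH}
```

The cache holds one set per `(d, x)` pair, so for a graph with millions of vertices the memory cost may well outweigh the saved traversals, which matches the doubt expressed above.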

You already mention the option of calculating `A^d`, but this is much, much more than you need (as you already remark).
There is, however, a much cheaper way of using this idea. Suppose you have a (column) vector `v` of zeros and ones, representing a set of vertices. The vector `w := A v` now has a positive entry at every node that can be reached from the starting node in exactly one step. Iterating, `u := A w` has a positive entry for every node you can reach from the starting node in exactly two steps, etc.
For `d=3`, you could do the following (MATLAB pseudo-code):
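n = size(A, 1);
v = zeros(n, 1); v(j) = 1;   % j-th unit vector
w = v;
for i = 1:d
    v = A*v;                 % entries count walks of length i from node j
    w = w + v;               % accumulate everything reached so far
end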
The vector `w` now has a positive entry for each node that can be accessed from the `j`th node in at most `d` steps.
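Since the question's graph is stored as an adjacency list rather than a matrix, the same iteration can be sketched in Python without building `A` at all. The function name `reachable_within` and the dict-based sparse vector are assumptions made for this illustration. The product `A*v` is computed by scattering each non-zero entry of `v` to its neighbours, which is valid here because the graph is undirected and `A` is therefore symmetric.

```python
def reachable_within(adj, j, d):
    """Vertices reachable from j in at most d steps, via repeated sparse A*v."""
    v = {j: 1}      # sparse j-th unit vector: only non-zero entries are stored
    w = dict(v)     # running sum v + A v + ... + A^i v
    for _ in range(d):
        nxt = {}
        for u, count in v.items():
            for nb in adj[u]:
                # Scatter form of the product: (A v)[nb] += v[u] for each edge u-nb.
                nxt[nb] = nxt.get(nb, 0) + count
        v = nxt
        for u, count in v.items():
            w[u] = w.get(u, 0) + count
    # Every key of w has a positive entry, i.e. was reached within d steps.
    return set(w)
```

Each pass touches only the edges incident to currently non-zero entries, so on a sparse graph with small `d` the cost stays local to the neighbourhood of `j`.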

-
I don't see how this is any cheaper (time-wise) than matrix multiplication since this process must be carried out n times. Breadth-first search is a better option for a single starting vertex anyways. Also there is no need for your vector `w`. `A^d * v` (calculated iteratively or otherwise) has the same non-zero elements as `w`. – Robert D Apr 19 '11 at 22:48
-
You're right, I thought you wanted to start from one vertex. However, please note that there is no matrix-matrix product in this approach, only matrix-vector products (which are much cheaper to compute). – Martijn Apr 20 '11 at 06:16
Breadth-first search starting with the given vertex is an optimal solution in this case. You will find all the vertices that are within distance d, and you will never even visit any vertices with distance >= d + 2.
Here is recursive code, although recursion can be easily done away with if so desired by using a queue.
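import java.util.HashSet;
import java.util.Set;

// Returns the set of nodes within distance d of x.
// adjList(x) is assumed to return the neighbours of x.
Set<Node> getNodesWithinDist(Node x, int d)
{
    Set<Node> s = new HashSet<Node>(); // our return value
    s.add(x); // x is at distance 0 from itself
    if (d > 0) {
        for (Node y: adjList(x)) {
            s.addAll(getNodesWithinDist(y, d-1));
        }
    }
    return s;
}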
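The queue-based variant mentioned above might look like the following Python sketch. The function name `nodes_within_dist` and the choice to return a dict of distances are assumptions for illustration. Unlike the recursive version, it records each vertex the first time it is seen, so no vertex is expanded twice.

```python
from collections import deque

def nodes_within_dist(adj, start, d):
    """Breadth-first search from start, cut off at depth d.

    Returns a dict mapping each vertex within distance d to its distance.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == d:
            continue  # neighbours of u would lie beyond the cutoff
        for nb in adj[u]:
            if nb not in dist:
                dist[nb] = dist[u] + 1
                queue.append(nb)
    return dist
```

Running this once per vertex gives the all-vertices answer, with each run limited to the edges within `d` steps of its start.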

-
This is the best solution for 1 starting vertex, but the problem is to do this for every vertex. – Robert D Apr 19 '11 at 20:14