Compute distance using DFS

Question

I was torn between these two methods:

M1:

Use adjacency list to represent graph G with vertices P and edges A
Use DFS on G storing all the distances from p in an array d;
Loop through d checking all entries. If some d[u] >6, return false otherwise true

M2:

Use adjacency list to represent graph G with vertices P and edges A
Use BFS on G storing all the distances from p in an array d;
Loop through d checking all entries. If some d[u] >6, return false otherwise true

Both these methods will produce a worst case O(|P| + |A|), therefore I think that both would be a correct answer to this question. I had chosen the DFS method, with the reasoning that with DFS you should be able to find the "outlier" of freedom degree 7 earlier than with BFS, since with BFS you would have to traverse every single Vertex until degree 7 in every case.

Apparently this is wrong according to the teacher, as using DFS, you can't compute the distances. I don't understand why you wouldn't be able to compute the distances. I could have a number n indicating the degree of freedom I am currently at. Starting from root p, the child would have n = 1. Now I store n in array d. Then I keep traversing down until no child is to be found, while incrementing n and storing the value in my array d. Then, if the back-tracking starts, the value n will be decremented until we find an unvisited child node of any of the visited nodes on the stack. If there is an unvisited child, increment once again, then increment until no more child is found, decrement until the next unvisited child from the stack is found...

I believe that would be a way to store the distances with DFS

**DO NOT post images of code, data, error messages, etc.** - copy or type the text into the question. [ask] — Rob, Jun 04 '20 at 22:46

trincot · Accepted Answer · 2020-06-06T17:44:02.650

Both BFS and DFS can do the job: they can both limit their search to a depth of 6, and at the end of the traversal they can check whether the whole population was reached or not. But there are some important differences:

With BFS

The BFS traversal is the algorithm I would opt for. When a BFS search determines the degree of a person, it is definitive: no correction needs to be made to it.

Here is sketch of how you can do this with BFS:

visited = set()  # empty set
frontier = []  # empty array
visited.add(p)  # search starts at person p
frontier.append(p)
for degree in [1, 2, 3, 4, 5, 6]:
    nextFrontier = []  # empty array
    for person in frontier:
        for acquaintance in A[person]:
            if acquaintance not in visited:
                visited.add(acquaintance)
                nextFrontier.append(acquaintance)
    frontier = nextFrontier
    if size(visited) == size(P):  # have we reached the whole population?
        return True
# After six rounds we did not reach all people, so...
return False

This assumes that you can find the list of acquaintances for a given person via A[person]. If A is not structured like an adjacency list but as a list of pairs, then first do some preprocessing on the original A to create such an adjacency list.

With DFS

A DFS algorithm has as downside that it will not necessarily start with optimal paths, and so it will find that some persons have degree 6, while there really are shorter, uninvestigated paths that could improve on that degree. This means that a DFS algorithm may need to revisit nodes and even partial paths (edges) to register such improvements and cascade them through a visited path up to degree 6. And there might even be several improvements to be applied for the same person.

A DFS algorithm could look like this:

degreeOfPerson = dict()  # empty key/value dictionary
for person in P:
    degreeOfPerson[person] = 7  # some value greater than 6

function dfs(person, degree):
    if degree >= 7:
        return  # don't lose time for higher degrees than 6.
    for acquaintance in A[person]:
        if degree < degreeOfPerson[acquaintance]:  # improvement?
            degreeOfPerson[acquaintance] = degree
            dfs(acquaintance, degree+1)

# start DFS
degreeOfPerson[p] = 0
dfs(p, 1)

# Check if all persons got a degree of maximum 6
for person in P:
    if degreeOfPerson[person] > 6:
        return False
return True

Example

If the graph has three nodes, linked as a triangle a-b-c, with starting point a, then this would be the sequence. Indentation means (recursive) call of dfs:

degreeOfPerson[a] = 0
    a->b: degreeOfPerson[b] = 1
        b->c: degreeOfPerson[c] = 2
            c->a: # cannot improve degreeOfPerson[a]. Backtrack
            c->b: # cannot improve degreeOfPerson[b]. Backtrack
        b->a: # cannot improve degreeOfPerson[a]. Backtrack
    a->c: degreeOfPerson[c] = 1  # improvement!
        c->a: # cannot improve degreeOfPerson[a]. Backtrack
        c->b: # cannot improve degreeOfPerson[b]. Backtrack

Time Complexity

The number of times the same edge can be visited with DFS is not more than the maximum degree we are looking for -- in your case 6. If that is a constant, then it does not affect the time complexity. If however the degree to check for is an input value, then the time complexity of DFS becomes O(maxdegree * |E| + |V|).

Because of how DFS would have to find the shortest path to every Node, does that mean that the complexity would also no longer be O(|V| + |E|)? — glera, Jun 06 '20 at 14:42
I will add pseudo code for a DFS algorithm to my answer with some comments. — trincot, Jun 06 '20 at 15:03
Thank you very much. I am having a little bit of trouble to understand this implementation, would this implementation be able to detect the effective shortest paths from our source node person p to all other nodes? For example if we had an undirected triangle graph and source node A. Distance from A to B and C are both 1, would it store (1,b) (1,c) in degreeOfPerson? — glera, Jun 06 '20 at 17:25
Oh thanks, I think I got it. It looks for every possible path from a to x and "improves" (saves) the new Degree if a shorter path has been found. Would the Complexity also stay O(|V| + |E|) with a constant degree, if I were to use an adjacency list? — glera, Jun 06 '20 at 18:09
Yes, that is the time complexity if you have an adjacency list (not an adjacency matrix). Be sure that you can find the appropriate list in O(1) time, given the node you want the neighbors of. — trincot, Jun 06 '20 at 18:27

Ekesh Kumar · Answer 2 · 2020-06-04T22:56:38.710

A simple depth-first search algorithm does not necessary yield the shortest path in an undirected graph. For example, consider a simple triangle graph. If you start at one vertex, you will process the other two vertices. A naive algorithm will find that there is one vertex whose distance equals one away from the source, and a second vertex whose distance equals two away from the source. However, this is incorrect since the distance from the source to either vertex is actually one.

A much more natural approach is to use the breadth-first search (BFS) algorithm. It can be shown that a breadth-first search computes shortest paths, and it requires significantly fewer modifications.

You definitely can use depth-first search to compute the distances from one node to another, but it is not a natural approach. In fact, it is very common to miscompute distances using a depth-first search algorithm (see: http://www-student.cse.buffalo.edu/~atri/cse331/support/dfs-bfs/index.html), particularly when the underlying graph has cycles. There are some special cases you must handle if you want to do it this way, but it definitely is possible.

With that being said, the depth-first search algorithm you describe does not appear to be correct. For example, it will fail on the triangle graph that I described above. This is true because the standard depth-first search only visits each vertex once, and you would not revisit a vertex after its distance has been set. Thus, if you take the "longer path" to a vertex in a cycle at first, you will end up with an incorrect distance value.

Thanks, I haven't thought about cycles yet. What I understand now is that DFS would be trickier to implement to compute distances than BFS for this example, but that it is still possible to calculate the correct distances with DFS. — glera, Jun 04 '20 at 22:57
I've been having a go at implement a correct algorithm for this, but haven't been able to. Do you maybe a working implementation for this prblem? — glera, Jun 05 '20 at 10:58

Compute distance using DFS

2 Answers2

With BFS

With DFS

Example

Time Complexity