
I've been learning about graph search and tree search recently, and I see many examples that say something like "the path returned by xxx graph search is...". However, graph search is a traversal method, so how can it return a path?

We know that in tree search each node corresponds to a particular path, so by visiting a node we know the corresponding path. I'm familiar enough with tree search, but in graph search the intention is to visit the destination rather than to find a path to it. How, then, can we define the path returned by graph search?

For example, below is a graph.

Graph
       B
      / \
     /   \
Start     Destination
     \   /
      \ /
       C

When we run BFS from Start, the visiting sequence is Start-B-C-Destination or Start-C-B-Destination. What should the returned path be? If we are using a queue, we can say that Destination is enqueued when visiting B (or C), so the returned path is Start-B-Destination (or Start-C-Destination). But then what about the following graph?

Graph
       B
      / \
     /   \
Start-----D-----Destination
     \   /
      \ /
       C

Clearly, D is enqueued when visiting Start, so the only possible path should be Start-D-Destination?

There are more problems in A* search. If a node is first visited via a path that is not the shortest one (because of a poor heuristic function), what should the final returned path be? For example, in the graph below, the visiting sequence is Start-B-D-C-Destination. Should the returned path be Start-B-D-Destination or Start-C-D-Destination? We visited D when B was in the fringe; however, when we calculate the cost from Start to D, we use Start-C-D, and the cost should be 2 rather than 3.

Graph
        B
       / \
      1   2
     /     \
Start       D--5--Destination
     \     /
      1   1
       \ /
        C

h(B) = 2, h(C) = 4, h(D) = 1

I think all these problems arise because the intention of graph search isn’t to find a path, so it is ambiguous which nodes should be included in the final path.

Thanks for your patience; I’d appreciate it a lot if anyone can give me an answer.

citrate
  • Suggestion: read about Dijkstra's algorithm https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm – ravenspoint May 16 '21 at 14:57
  • Dijkstra's algorithm, as a graph search algorithm, can return a path because we know how the shortest path to a particular node is obtained. The previous node of A in a path is the node that updates A's distance. In cases like BFS, there's no such relationship. – citrate May 16 '21 at 15:07

2 Answers


I will use your example:

        B
       / \
      1   2
     /     \
Start       D--5--Destination
     \     /
      1   1
       \ /
        C

Note that a pure Breadth-First Search pays no attention to edge weights (or cost). In this case, the first path that takes it to the destination might be Start->B->D->Destination.

It marks each node it visits with a boolean flag to indicate that the node has been visited, so that it doesn't pursue Start->B->Start->C->Start->... In order to remember the path that succeeded, we must store a little more information somewhere. One simple approach is to mark each node, not just with the visited flag, but with a reference to the node from which it was visited. So if the visitation order was Start, B, C, D, Destination, the graph will look like this:

        B(Start)
       / \
      1   2
     /     \
Start       D(B)--5--Destination(D)
     \     /
      1   1
       \ /
        C(Start)

Then we can backtrack. How did we get to Destination? From D. How did we get to D? From B. How did we get to B? From Start. Reversing that chain gives the path Start->B->D->Destination.
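The backtracking idea above can be sketched in Python roughly as follows. This is only an illustration, not a canonical implementation: the names `GRAPH`, `bfs_path` and `parent` are made up for this example, and the neighbor order in the adjacency list is an assumption (it decides whether D is first reached from B or from C).

```python
from collections import deque

# Example graph as an adjacency list; BFS ignores the edge weights.
# The neighbor order is an assumption and determines tie-breaking.
GRAPH = {
    "Start": ["B", "C"],
    "B": ["Start", "D"],
    "C": ["Start", "D"],
    "D": ["B", "C", "Destination"],
    "Destination": ["D"],
}

def bfs_path(graph, start, goal):
    # parent[n] is the node from which n was first reached.
    # It also serves as the "visited" marker.
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for neighbor in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    if goal not in parent:
        return None  # goal is unreachable
    # Backtrack from the goal to the start, then reverse.
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

print(bfs_path(GRAPH, "Start", "Destination"))
```

With the neighbor order shown, D is recorded as reached from B, so the reconstructed path is Start, B, D, Destination.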

Breadth-First Search follows the first edge in the queue, and finds the shortest path (in the sense of smallest number of edges). If we want the least-weight (cheapest) path, we modify the search to keep track of cost-to-get-to-this-node, and follow the edge that reaches a new node most cheaply. (This is one version of Dijkstra's Algorithm.) Then in this case the visitation order will be the same, but we will visit D from C, not from B, and get the path Start->C->D->Destination.
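A rough sketch of that cost-tracking variant, using Python's `heapq` as the priority queue. The names `WEIGHTED`, `dijkstra_path`, `dist` and `prev` are chosen for this illustration only.

```python
import heapq

# The weighted example graph (undirected, so each edge appears twice).
WEIGHTED = {
    "Start": {"B": 1, "C": 1},
    "B": {"Start": 1, "D": 2},
    "C": {"Start": 1, "D": 1},
    "D": {"B": 2, "C": 1, "Destination": 5},
    "Destination": {"D": 5},
}

def dijkstra_path(graph, start, goal):
    dist = {start: 0}     # cheapest known cost to reach each node
    prev = {start: None}  # node from which that cheapest cost was achieved
    heap = [(0, start)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # outdated queue entry
        done.add(node)
        if node == goal:
            break
        for neighbor, weight in graph[node].items():
            alt = d + weight
            if alt < dist.get(neighbor, float("inf")):
                # Found a cheaper way to reach neighbor: overwrite its "previous" node.
                dist[neighbor] = alt
                prev[neighbor] = node
                heapq.heappush(heap, (alt, neighbor))
    if goal not in prev:
        return None, float("inf")
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1], dist[goal]

print(dijkstra_path(WEIGHTED, "Start", "Destination"))
```

Here D is first reached from B at cost 3, then its `prev` entry is overwritten when C offers cost 2, which is why backtracking yields Start, C, D, Destination with total cost 7.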

With A*, using h(B) = 2, h(C) = 4, h(D) = 1, the visitation order is Start, B, D, C, D, Destination, and the path is Start->C->D->Destination.
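To see where the repeated D comes from, here is a minimal A* sketch that allows a node to be re-expanded when a cheaper route to it turns up. It is an illustration under stated assumptions: the question gives no value for h(Start), so 0 is assumed (it does not affect the order, since Start is expanded first regardless), and h(Destination) = 0 as usual for a goal. The names `a_star`, `WEIGHTED`, `H` and `expanded` are made up for this example.

```python
import heapq

WEIGHTED = {
    "Start": {"B": 1, "C": 1},
    "B": {"Start": 1, "D": 2},
    "C": {"Start": 1, "D": 1},
    "D": {"B": 2, "C": 1, "Destination": 5},
    "Destination": {"D": 5},
}
# h(B), h(C), h(D) come from the question; h(Start) and h(Destination) are assumed.
H = {"Start": 0, "B": 2, "C": 4, "D": 1, "Destination": 0}

def a_star(graph, h, start, goal):
    g = {start: 0}        # cheapest known cost from start
    prev = {start: None}  # predecessor on that cheapest known route
    heap = [(h[start], start)]
    expanded = []         # order in which nodes are expanded, repeats included
    while heap:
        f, node = heapq.heappop(heap)
        if f > g[node] + h[node]:
            continue  # stale entry: a cheaper route to node was found later
        expanded.append(node)
        if node == goal:
            break
        for neighbor, weight in graph[node].items():
            alt = g[node] + weight
            if alt < g.get(neighbor, float("inf")):
                # Cheaper route found: update and (re-)enqueue, even if
                # neighbor was already expanded once.
                g[neighbor] = alt
                prev[neighbor] = node
                heapq.heappush(heap, (alt + h[neighbor], neighbor))
    if goal not in prev:
        return None, expanded
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1], expanded

path, order = a_star(WEIGHTED, H, "Start", "Destination")
print(order)
print(path)
```

D is expanded once with cost 3 (via B) and again with cost 2 (via C); the second expansion is what fixes Destination's cost and predecessor chain.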

Beta
  • Thanks for your explanation! I think I understand most of the content. So, according to your answer, in my second graph, if we are using BFS, the only possible path is `Start-D-Destination` and if we are using DFS, all three paths are possible right? – citrate May 17 '21 at 01:49

For BFS, there are two options:

  1. When you enqueue a node to the queue, mark which neighbor you're coming from. Then at the end, every node on the best path will have a link to the "previous" node on the path, so you can backtrack directly from the Destination to the Start to find the path.

  2. When you enqueue a node to the queue, make note of the best path to that node, which is just the best path to its neighbor, plus the edge you took.

Option 1 is what's usually done because it's more performant.
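For contrast, option 2 might look like the sketch below, where each queue entry is the whole path so far. The graph is the asker's second example (with the direct Start-D edge); the names `GRAPH2` and `bfs_carrying_paths` are invented for this illustration.

```python
from collections import deque

# The asker's second graph: D is a direct neighbor of Start.
GRAPH2 = {
    "Start": ["B", "C", "D"],
    "B": ["Start", "D"],
    "C": ["Start", "D"],
    "D": ["Start", "B", "C", "Destination"],
    "Destination": ["D"],
}

def bfs_carrying_paths(graph, start, goal):
    visited = {start}
    queue = deque([[start]])  # each entry is the full path to its last node
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                # Copying the path on every enqueue is the extra cost of option 2.
                queue.append(path + [neighbor])
    return None

print(bfs_carrying_paths(GRAPH2, "Start", "Destination"))
```

Because D is enqueued directly from Start, the path returned here is Start, D, Destination whatever the neighbor order. The list copy on each enqueue is the overhead that option 1 avoids by storing a single predecessor per node.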


For Dijkstra's (BFS for weighted graphs), the same is true, except that when you enqueue a node and note which neighbor you came from, you haven't necessarily found the best path to that node. If you later encounter the same node from a different neighbor, you need to compare the total distances of both paths and keep the neighbor on the shorter one as the "previous" node.

This is done in lines 18-20 of the pseudocode on Wikipedia

18  if alt < dist[v]:              
19    dist[v] ← alt
20    prev[v] ← u

Once you expand the node (remove it from the priority queue), you're guaranteed to have found the best path to the node.


For A*, the situation is the same as Dijkstra's if the heuristic is consistent. However, if the heuristic is only admissible, then you're not guaranteed to have found the best path to a node when you first expand it. In this case, you may need to enqueue and expand the same node multiple times.

For this reason, it's actually not very common to use A* with a non-consistent heuristic.
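As a small illustration, the heuristic from the question can be checked for consistency, meaning h(u) <= w(u, v) + h(v) for every edge. The sketch below only looks at edges whose endpoints both have a heuristic value; h(Destination) = 0 is assumed, and h(Start) is left out because the question does not give it. The names `WEIGHTED`, `H_GIVEN` and `consistency_violations` are made up for this example.

```python
WEIGHTED = {
    "Start": {"B": 1, "C": 1},
    "B": {"Start": 1, "D": 2},
    "C": {"Start": 1, "D": 1},
    "D": {"B": 2, "C": 1, "Destination": 5},
    "Destination": {"D": 5},
}
# Values from the question, plus the assumed h(Destination) = 0.
H_GIVEN = {"B": 2, "C": 4, "D": 1, "Destination": 0}

def consistency_violations(graph, h):
    # Return every directed edge (u, v) with h(u) > w(u, v) + h(v),
    # skipping edges where either endpoint has no heuristic value.
    return [
        (u, v)
        for u, neighbors in graph.items()
        for v, weight in neighbors.items()
        if u in h and v in h and h[u] > weight + h[v]
    ]

print(consistency_violations(WEIGHTED, H_GIVEN))
```

The only violation is the edge from C to D, since h(C) = 4 exceeds 1 + h(D) = 2. The values are still admissible (the true remaining costs from B, C and D are 7, 6 and 5), which matches the case where D has to be expanded twice.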

BlueRaja - Danny Pflughoeft
  • Thanks! I think I didn't understand A* search correctly, because I didn't realize that a node can be visited multiple times in A* search. – citrate May 17 '21 at 01:48
  • Does it mean A* search can finally return the shortest path even if the heuristic function is not consistent? – citrate May 17 '21 at 04:26
  • @citrate: If the heuristic is consistent, A* is guaranteed to return the shortest path, and whenever a node is expanded, the path found to it is guaranteed to be the shortest path to that node. If the heuristic is admissible but not consistent, then A* is still guaranteed to return the shortest path, but nodes may need to be expanded multiple times. If the heuristic is not admissible, it might not return the shortest path (this is actually used as an advantage in some other algorithms, but that's another discussion) – BlueRaja - Danny Pflughoeft May 17 '21 at 18:56