13

There's a gap in my knowledge but I'm not sure exactly where. Topological sorting can be done using depth first search, as wikipedia explains. However I've only seen depth first search implemented for trees, where as topological sort is for DAGs.

  1. is a tree a special case of a DAG where the implied direction of the edges is from the root node down
  2. is the algorithm used for topological sort not really doing DFS, only something a lot like it?

For example, topological sort can handle disconnected graphs where as DFS cannot traverse a node with no edges connecting it...can it?

Dominique Fortin
  • 2,212
  • 15
  • 20
Celeritas
  • 14,489
  • 36
  • 113
  • 194

2 Answers2

14

Because when used for topological sort you do a DFS on every nodes. If one of the children is already visited by a previous DFS (colored black). Then it is already pushed in the output vector and so you the dependency is already done.

Quoting your link (emphasis mine):

The algorithm loops through each node of the graph, in an arbitrary order, initiating a depth-first search ...

As the wikipedia article is a bit confusing (in my opinion) I will try to better describe the algorithm:

     V: set of vertices
     E: set of edges
     E.adj(v): set of vertices adjacent to vertex v

 0 function topological_sort(V,E):
 1   for v in V:
 2     paint v white
 3
 4   for v in V:
 5     if v is white:
 6       dfs(v)

 7 function dfs(v):
 8   paint v grey
 9   for child in E.adj(v)
10     if child is white:
11       dfs(child)
12   paint v black
13   push v to output

We can easily compute the complexity:

  • We paint a vertex white, grey and black once per vertex: O(V)
  • We check the color of the vertex at line 5 once per vertex: O(V)
  • We check the color of the vertex at line 10 once per edge: O(E)
  • We push the vertex to output at line 13 once per vertex: O(V)

So the final complexity is: O(V+E). It is pretty efficient.

One of the strength of this algorithm is that it doesn't need to modify the input graph. We can easily implement coloring by a temporary hashtable with a size in O(V). Some other topological sort algorithm requires to destroy the graph when they proceed (by removing edges). This would require to copy the graph before running the toplogical sort if you still need the graph after the sort.

fjardon
  • 7,921
  • 22
  • 31
  • What do you mean you do a DFS on every node? Are you saying for a graph with n nodes you'd run DFS n times? – Celeritas Apr 19 '16 at 08:11
  • Exactly. But, this is important, you stop going deep if you encounter a node already painted black by a previous DFS calls. – fjardon Apr 19 '16 at 08:13
  • Nonetheless this seems awfully inefficient to even initialize DFS n times. – Celeritas Apr 19 '16 at 08:14
  • You visit each node at most once, and each edge at most once. – fjardon Apr 19 '16 at 08:15
  • @Celeritas See my update for the complexity computation – fjardon Apr 19 '16 at 08:46
  • Thanks for the update. Correct me if I'm wrong, but wouldn't DFS implementations normally "reset" the visited state of the nodes after they return? It appears in your example the paint still stays on a node after `dfs()` ends. I guess what I'm asking/saying is: isn't a DFS implementation specific for topological sorting required, not any old DFS implementation? Let me know if that didn't make sense and I'll reword. – Celeritas Apr 19 '16 at 11:20
  • @Celeritas It really depends on how you store the color information. You can very well store colors in a hashtable, local to your topological sort function, in which case there is no need to clear colors ater the sort. If you store the color inside the graph then yes you will have to restore it to *their original values*. – fjardon Apr 19 '16 at 11:24
  • @fjardon Why do you paint `v` **grey** in line 8? Can you paint it black straight away? – kgf3JfUtW Jan 31 '17 at 16:58
  • 1
    @SamShen In this specific case it doesn't matter. Usually when you do a DFS you want to know the difference between a grey and black node. – fjardon Jan 31 '17 at 17:01
1

You might wish to try to add a new "source" node to your graph, and connect it with each and every other nodes with a directed edge. Then start your search/traverse from this new node. This approach might or might not suit your needs.

Matsmath
  • 1,164
  • 2
  • 21
  • 40