
Topological sort can be done using a DFS (with the edges reversed) or using a queue. A BFS can also be done using a queue. Is there any relationship between the way elements are stored and retrieved when a queue is used for BFS and when a queue is used for topological sorting? A clarification would be helpful. Thanks.

motiur
  • `` chp 11.3 described this very clearly. – Eric Feb 04 '15 at 07:32
  • Related posts - [Using BFS for topological sort](https://stackoverflow.com/q/25229624/465053), [Topological search and Breadth first search](https://stackoverflow.com/q/14906533/465053), & [Topological order using bfs](https://stackoverflow.com/q/30869987/465053) – RBT Jul 13 '18 at 00:12

2 Answers


The level-by-level traversal of BFS from a source node makes the nodes appear in order of their distance from the source, which also means that parent nodes appear before their child nodes in the next level.

This might look like what we need in a topological sort; however, stay with me. The phrase "next level" in the previous sentence is the key: if a node and its child are at the same level from the source, BFS enforces no order between them, so it may present the child before the node itself. That directly violates the rule of topological sort, where the ordering is exactly what matters.

Although it seems like there is a relationship between BFS and topological sort, it is rather weak.
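To make the same-level problem concrete, here is a minimal sketch. The three-vertex graph and the helper name `bfs_order` are made up for illustration; the point is only that a plain BFS from a source can emit a child before its parent when both are at the same distance from that source.

```python
from collections import deque

def bfs_order(adj, source):
    """Return vertices in the order a plain BFS from `source` dequeues them."""
    seen = {source}
    order = []
    queue = deque([source])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return order

# Hypothetical DAG with edges s -> a, s -> b, b -> a.
# Both a and b are one edge away from s, yet the edge b -> a
# requires b to come before a in any topological order.
adj = {"s": ["a", "b"], "a": [], "b": ["a"]}
order = bfs_order(adj, "s")
print(order)  # ['s', 'a', 'b']
```

Here `a` is emitted before `b` even though the edge `b -> a` exists, so this BFS order is not a topological order.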

Sarath M

No, there is not necessarily any relationship. I assume you are referring to Kahn's algorithm from wikipedia/Topological_sorting#Algorithms, about which Wikipedia notes:

Note that, reflecting the non-uniqueness of the resulting sort, the structure S can be simply a set or a queue or a stack.

Thus the "queue" for topological sorting is really an "any collection" structure, and the ordering within this collection does not matter; it can be anything. The queue used for BFS, on the other hand, is all about the order: it has to be FIFO (first-in, first-out) to do its job. Changing this ordering will ruin the BFS algorithm.
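As a rough illustration of why the order matters for BFS, the sketch below runs the same search loop with a FIFO frontier and with a LIFO one. The function name `traverse`, the `fifo` flag and the small graph are assumptions made for this example only.

```python
from collections import deque

def traverse(adj, source, fifo=True):
    """Generic graph search: fifo=True pops the oldest frontier vertex (BFS),
    fifo=False pops the newest one instead."""
    seen = {source}
    frontier = deque([source])
    order = []
    while frontier:
        u = frontier.popleft() if fifo else frontier.pop()
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                frontier.append(v)
    return order

# Hypothetical graph with edges 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 5.
adj = {1: [2, 3], 2: [4], 3: [5], 4: [], 5: []}
fifo_order = traverse(adj, 1, fifo=True)
lifo_order = traverse(adj, 1, fifo=False)
print(fifo_order)  # [1, 2, 3, 4, 5]  distances from 1: 0, 1, 1, 2, 2
print(lifo_order)  # [1, 3, 5, 2, 4]  distances from 1: 0, 1, 2, 1, 2
```

With the FIFO frontier the vertices come out in non-decreasing distance from the source, which is the defining property of BFS. With the LIFO frontier that property is lost, so the result is no longer a breadth-first order.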

There might be other "queue" based algorithms for topological sort, where it does matter that the structure is a queue. If you are asking about a particular such algorithm, please clarify.

EDIT: Algorithm of interest is clarified to be Improved algorithm section, which is the same as Kahn's.

EDIT: I've written some code that implements topological sort according to the Improved algorithm section in the page you linked. The type of collection it uses is arbitrary and is passed as an argument to the sort function. I then made a few such collection types: a stack, a queue, a random-pop collection, and a Python set (it's a hash set, so there are no guarantees on order).

I then make a graph, and test the sorting algorithm on it with each collection. Then I test each of the results using the definition listed on wikipedia of topological sort:

.. a topological sort (sometimes abbreviated topsort or toposort) or topological ordering of a directed graph is a linear ordering of its vertices such that, for every edge uv, u comes before v in the ordering.

wikipedia
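The definition translates almost directly into a check. The sketch below is one possible way to write it, not the checker used in the code that follows: the name `is_topological_order` is made up here, and it additionally verifies that every vertex appears exactly once.

```python
def is_topological_order(vertices, edges, sequence):
    """Check the definition: every vertex appears exactly once,
    and u comes before v for every edge (u, v)."""
    sequence = list(sequence)
    # Map each vertex to its index so each edge is checked in O(1).
    position = {v: i for i, v in enumerate(sequence)}
    if len(position) != len(sequence) or set(position) != set(vertices):
        return False
    return all(position[u] < position[v] for u, v in edges)

# Hypothetical diamond-shaped DAG.
V = [1, 2, 3, 4]
E = [(1, 2), (1, 3), (2, 4), (3, 4)]
print(is_topological_order(V, E, [1, 2, 3, 4]))  # True
print(is_topological_order(V, E, [1, 3, 2, 4]))  # True
print(is_topological_order(V, E, [2, 1, 3, 4]))  # False: edge (1, 2) is violated
```

Two different sequences pass the check for the same graph, which is the non-uniqueness that the Wikipedia note about the structure S refers to.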

The code is written in python and follows. The result is here from http://ideone.com. I don't know a good easy way to generate random DAGs for testing so my test graph is lame. Feel free to comment/edit a good DAG generator.

EDIT: Now I have a less lame generator, but it uses networkx. The function nx_generate_random_dag is in the code, but it imports networkx in the function. You can uncomment the marked section in main to generate graphs. I hardcoded a generated graph into the code, so we get more interesting results.

All of this is to show that the "collection" data structure (the queue in the algorithm) can hand back its elements in any order.

from collections import deque
import random


def is_topsorted(V,E,sequence):
  sequence = list(sequence)
  #from wikipedia definition of top-sort
  #for every edge uv, u comes before v in the ordering
  for u,v in E:
    ui = sequence.index(u)
    vi = sequence.index(v)
    if not (ui < vi):
      return False
  return True 

#the collection_type should behave like a set:
# it must have add(), pop() and __len__() as members.
def topsort(V,E,collection_type):
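  #in edges: INS[v] is the set of vertices u with an edge (u,v)
  INS = {}

  #out edges: OUTS[u] is the set of vertices v with an edge (u,v)
  OUTS = {}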
  for v in V:
    INS[v] = set()
    OUTS[v] = set()

  #for each edge u,v,
  for u,v in E:
    #record the out-edge from u
    OUTS[u].add(v)
    #record the in-edge to v
    INS[v].add(u)

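  #1. Store all vertices with indegree 0 in a queue
  #We will start from these vertices, since nothing has to come before them.
  topvertices = collection_type()

  #items() works in both Python 2 and Python 3 (iteritems() is Python 2 only)
  for v,in_vertices in INS.items():
    if len(in_vertices) == 0:
      topvertices.add(v)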

  result = []

  #4. Perform steps 2 and 3 while the queue is not empty.
  while len(topvertices) != 0:  
    #2. get a vertex U and place it in the sorted sequence (array or another queue).
    u = topvertices.pop()
    result.append(u)

    #3. For all edges (U,V) update the indegree of V,
    # and put V in the queue if the updated indegree is 0.

    for v in OUTS[u]:
      INS[v].remove(u)
      if len(INS[v]) == 0:
        topvertices.add(v)

  return result

class stack_collection:
  def __init__(self):
    self.data = list()
  def add(self,v):
    self.data.append(v)
  def pop(self):
    return self.data.pop()
  def __len__(self):
    return len(self.data)

class queue_collection:
  def __init__(self):
    self.data = deque()
  def add(self,v):
    self.data.append(v)
  def pop(self):
    return self.data.popleft()
  def __len__(self):
    return len(self.data)

class random_orderd_collection:
  def __init__(self):
    self.data = []
  def add(self,v):
    self.data.append(v)
  def pop(self):    
    result = random.choice(self.data)
    self.data.remove(result)
    return result
  def __len__(self):
    return len(self.data)

"""
Poor man's graph generator.
Requires networkx.

Don't make the edge_count too high compared with the vertex count,
 otherwise it will run for a long time or forever.
"""
def nx_generate_random_dag(vertex_count,edge_count):
  import networkx as nx

  #build a real list so that random.shuffle also works on Python 3
  V = list(range(1,vertex_count+1))
  random.shuffle(V)

  G = nx.DiGraph()
  G.add_nodes_from(V)

  while nx.number_of_edges(G) < edge_count:

    u = random.choice(V)
    v = random.choice(V)
    if u == v:
      continue

    for tries in range(2):
      G.add_edge(u,v)
      if not nx.is_directed_acyclic_graph(G):
        G.remove_edge(u,v)
        u,v = v,u
  V = G.nodes()
  E = G.edges()

  assert len(E) == edge_count
  assert len(V) == vertex_count
  return V,E




def main():

  graphs = []

  V = [1,2,3,4,5]
  E = [(1,2),(1,5),(1,4),(2,4),(2,5),(3,4),(3,5)]

  graphs.append((V,E))

  """
  Uncomment this section if you have networkx.
  This will generate 3 random graphs.
  """
  """
  for i in range(3):
    G = nx_generate_random_dag(30,120)
    V,E = G
    print('random E: {0}'.format(E))
    graphs.append(G)
  """


  #This graph was generated using nx_generate_random_dag() from above
  V = list(range(1,31))
  E = [(1, 10), (1, 11), (1, 14), (1, 17), (1, 18), (1, 21), (1, 23),
       (1, 30), (2, 4), (2, 12), (2, 15), (2, 17), (2, 18), (2, 19),
       (2, 25), (3, 22), (4, 5), (4, 8), (4, 22), (4, 23), (4, 26),
       (5, 27), (5, 23), (6, 24), (6, 28), (6, 27), (6, 20), (6, 29),
       (7, 3), (7, 19), (7, 13), (8, 24), (8, 10), (8, 3), (8, 12),
       (9, 4), (9, 8), (9, 10), (9, 14), (9, 19), (9, 27), (9, 28),
       (9, 29), (10, 18), (10, 5), (10, 23), (11, 27), (11, 5),
       (12, 10), (13, 9), (13, 26), (13, 3), (13, 12), (13, 6), (14, 24),
       (14, 28), (14, 18), (14, 20), (15, 3), (15, 12), (15, 17), (15, 19),
       (15, 25), (15, 27), (16, 4), (16, 5), (16, 8), (16, 18), (16, 20), (16, 23),
       (16, 26), (16, 28), (17, 4), (17, 5), (17, 8), (17, 12), (17, 22), (17, 28),
       (18, 11), (18, 3), (19, 10), (19, 18), (19, 5), (19, 22), (20, 5), (20, 29),
       (21, 25), (21, 12), (21, 30), (21, 17), (22, 11), (24, 3), (24, 10),
       (24, 11), (24, 28), (25, 10), (25, 17), (25, 23), (25, 27), (26, 3),
       (26, 18), (26, 19), (28, 26), (28, 11), (28, 23), (29, 2), (29, 4),
       (29, 11), (29, 15), (29, 17), (29, 22), (29, 23), (30, 3), (30, 7),
       (30, 17), (30, 20), (30, 25), (30, 26), (30, 28), (30, 29)]

  graphs.append((V,E))

  #add other graphs here for testing


  for G in graphs:
    V,E = G

    #sets in python are unordered but in practice their hashes usually order integers.
    top_set = topsort(V,E,set)

    top_stack = topsort(V,E,stack_collection)

    top_queue = topsort(V,E,queue_collection)

    random_results = []
    for i in range(0,10):
      random_results.append(topsort(V,E,random_orderd_collection))

    #print(...) with a single argument behaves the same in Python 2 and Python 3
    print('')
    print('V: {0}'.format(V))
    print('E: {0}'.format(E))
    print('top_set ({0}): {1}'.format(is_topsorted(V,E,top_set),top_set))
    print('top_stack ({0}): {1}'.format(is_topsorted(V,E,top_stack),top_stack))
    print('top_queue ({0}): {1}'.format(is_topsorted(V,E,top_queue),top_queue))

    for random_result in random_results:
      print('random_result ({0}): {1}'.format(is_topsorted(V,E,random_result),random_result))
      assert is_topsorted(V,E,random_result)

    assert is_topsorted(V,E,top_set)
    assert is_topsorted(V,E,top_stack)
    assert is_topsorted(V,E,top_queue)



main()
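For anyone without networkx, a random DAG can also be generated with the standard library alone. The sketch below is a suggestion rather than part of the tested code above: `generate_random_dag` and its `seed` parameter are hypothetical names. It shuffles the vertices into a hidden order and only picks edges that point forward in that order, so no cycle can ever be created.

```python
import random

def generate_random_dag(vertex_count, edge_count, seed=None):
    """Return (V, E) for a random DAG built without networkx."""
    rng = random.Random(seed)
    V = list(range(1, vertex_count + 1))

    # A hidden random order; every edge will go from an earlier
    # vertex to a later one in this order, which rules out cycles.
    hidden_order = V[:]
    rng.shuffle(hidden_order)

    max_edges = vertex_count * (vertex_count - 1) // 2
    if edge_count > max_edges:
        raise ValueError("a DAG on %d vertices has at most %d edges"
                         % (vertex_count, max_edges))

    # All forward pairs are candidate edges; sample without replacement.
    candidates = [(hidden_order[i], hidden_order[j])
                  for i in range(vertex_count)
                  for j in range(i + 1, vertex_count)]
    E = rng.sample(candidates, edge_count)
    return V, E

V, E = generate_random_dag(30, 120, seed=0)
print(len(V), len(E))  # 30 120
```

The returned `V` and `E` have the same shape as the vertex list and edge list used by `topsort` above. Unlike the retry loop in `nx_generate_random_dag`, this always terminates, at the cost of building the full list of candidate edges.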
Realz Slaw
  • There is a description of performing topological sort using a queue at http://faculty.simpson.edu/lydia.sinapova/www/cmsc250/LN250_Weiss/L20-TopSort.htm. The concept of using indegree and outdegree is similar to going through one layer after another in BFS (breadth-first search). BTW: I think you meant BFS instead of DFS above. – motiur Sep 12 '12 at 14:59
  • Section 4. Improved algorithm is the exact same algorithm described by Kahn. `Store all vertices with indegree 0 in a queue` << the ordering to this structure doesn't matter. `array or another queue` << this ordering does matter, but this is simply the result of the sort, so I am sure this is not the queue that you meant. – Realz Slaw Sep 12 '12 at 15:05
  • Yes, I am concentrating on the fact that vertices having the same indegree go first, then their children go (similarly, in a BFS the children can be considered the second level and the parent the first level); then the parents are knocked out, similar to the FIFO concept used in BFS. Doesn't it sound similar to the BFS algorithm? – motiur Sep 12 '12 at 15:14
  • There is similarity between them; BFS and this algorithm traverse the graph "level" by "level". However, BFS cares about the order within the level. Topological sort does not. As a side note: I think BFS might not work too well on an actual DAG since there can be multiple "roots". Good question btw, it makes for good clarification. – Realz Slaw Sep 12 '12 at 15:24
  • "BFS cares about the order within the level. Topological sort does not": can you give a good citation for this fact? Also, I agree that there is a concern about a topological sort having multiple roots, whereas BFS generally has a single root. – motiur Sep 12 '12 at 15:33