5

I have this kind of Directed Acyclic Graph with multiple roots:

DiGraph

And I need to get a list with nodes sorted by directions and grouped by steps, like this:

ordering = [
    [1, 3],
    [2],
    [4],
    [5, 6],
    [7]
]

Maybe there is some ready algorithm for this? I tried networkx.algorithm but all of them can return me only a flat list without grouping by steps.

Taras Protsenko
  • 545
  • 1
  • 6
  • 15
  • @Rahul, you can't get the node without all shes predecessors, that only one rule I used in ordering. For example node 2 you can't get before 1. So on the first step, you can get only 1 and 3 nodes, on the second you can get only 2 because 4 depends on 2 and 3. – Taras Protsenko Jun 28 '19 at 08:39

5 Answers5

9

nx.topological_sort almost does what you want; the only difference is that it doesn't group items that enter the queue at the same time, but it's straightforward to adapt the source to make it do so:

def topological_sort_grouped(G):
    indegree_map = {v: d for v, d in G.in_degree() if d > 0}
    zero_indegree = [v for v, d in G.in_degree() if d == 0]
    while zero_indegree:
        yield zero_indegree
        new_zero_indegree = []
        for v in zero_indegree:
            for _, child in G.edges(v):
                indegree_map[child] -= 1
                if not indegree_map[child]:
                    new_zero_indegree.append(child)
        zero_indegree = new_zero_indegree

With your example:

In [21]: list(nx.topological_sort(G))
Out[21]: [3, 1, 2, 4, 6, 7, 5]

In [22]: list(topological_sort_grouped(G))
Out[22]: [[1, 3], [2], [4], [5, 6], [7]]

In practice, I have to wonder if there's a situation where this construction is more useful than just using nx.topological_sort (or nx.lexicographical_topological_sort) directly?

fuglede
  • 17,388
  • 2
  • 54
  • 99
  • In my case, I need to make requests to each component, with this algorithm I can do multiple requests in one time. Thanks. – Taras Protsenko Jun 30 '19 at 07:59
  • I see. Sounds a bit like a simple queue might be a better fit. For example, suppose you had an additional request "8" and an additional edge (5, 8). Then with the approach at hand, you would get distinct groups `[7]` and `[8]` while, if I understand you correctly, the requests corresponding to those two could in fact be carried out in parallel. – fuglede Jun 30 '19 at 08:35
3

Try out NetworkX's topological generations

DG = nx.DiGraph([(1,2), (2,4), (3,4), (4,5), (4,6), (6,7)])
[sorted(generation) for generation in nx.topological_generations(DG)]
[[1, 3], [2], [4], [5, 6], [7]]
2

topological sort using DFS will solve the problem

from collections import defaultdict, deque


class Graph:
    def __init__(self, directed=False, nodes=None, edges=None):
        self.graph = defaultdict(list)
        self.directed = directed
        self.add_nodes(nodes)
        self.add_edges(edges)

    @property
    def nodes(self):
        if not self.directed:
            return list(self.graph.keys())
        elif self.directed:
            nodes = set()
            nodes.update(self.graph.keys())
            for node in self.graph.keys():
                for neighbor in self.graph[node]:
                    nodes.add(neighbor)
            return list(nodes)

    def add_node(self, node):
        if node not in self.nodes:
            self.graph[node] = list()

    def add_nodes(self, nodes):
        if nodes is None:
            return None
        for node in nodes:
            self.add_node(node)

    def remove_node(self, node):
        try:
            del self.graph[node]
        except KeyError:
            print(f'{node} is not in graph')
            return None
        # remove parallel edges, but keep other parallel edges untouched
        for source, adj_list in self.graph.items():
            empty = True
            while empty:
                if node in adj_list:
                    adj_list.remove(node)
                else:
                    empty = False

    def remove_nodes(self, nodes):
        for node in nodes:
            self.remove_node(node)

    @property
    def edges(self):
        edges = list()
        for source, neighbors in self.graph.items():
            for neighbor in neighbors:
                edges.append((source, neighbor))
        return edges

    def add_edge(self, edge):
        node1, node2 = edge
        self.graph[node1].append(node2)
        if not self.directed:
            self.graph[node2].append(node1)

    def add_edges(self, edges):
        if edges is None:
            return None
        for edge in edges:
            self.add_edge(edge)

    def remove_edge(self, edge):
        self.remove_nodes(edge)

    def remove_edges(self, edges):
        for edge in edges:
            self.remove_nodes(edge)

    def topological_util(self, node, visited, label):
        visited[node] = True
        for edge in self.graph[node]:
            if not visited[edge]:
                self.topological_util(edge, visited, label)
        label.appendleft(node)

    def topological_sort(self):
        visited = dict.fromkeys(self.nodes, False)
        # store all nodes in topological order, the index is the position
        label = deque()
        for node in self.nodes:
            if not visited[node]:
                self.topological_util(node, visited, label)
        return label

The class Graph and topological sort implemented in python. Hope this will help you.

Yossarian42
  • 1,950
  • 17
  • 14
1

Your problem is solved by what is known as a "topological sort." Such a sort determines dependencies in a directed acyclic graph. I recently adapted a solution to this problem. Here is a simple python application that demonstrates its behavior:

# adapted from https://gist.github.com/kachayev/5910538
from collections import deque
GRAY, BLACK = 0, 1


def topological(graph):
    order, enter, state = deque(), set(graph), {}

    dot = "digraph X {\r\n"
    for item in graph.keys():
        dep = graph[item]
        for d in dep:
            dot += item + " -> " + str(d) + '\r\n'
    dot += "}"
    print(dot)

    def dfs(node):
        state[node] = GRAY
        for k in graph.get(node, ()):
            sk = state.get(k, None)
            if sk == GRAY:
                raise ValueError("cycle")
            if sk == BLACK:
                continue
            enter.discard(k)
            dfs(k)
        #order.appendleft(node)  # show highest to lowest
        order.append(node)  # show lowest to highest
        state[node] = BLACK
    while enter:
        dfs(enter.pop())
    return order


def main():
    graph = {
        '1': ['2'],
        '2': ['4'],
        '3': ['4'],
        '4': ['5', '6'],
        '6': ['7'],
    }
    try:
        print(topological(graph))
    except ValueError:
        print("Cycle!")


if __name__ == "__main__":
    main()

The output is

deque(['5', '7', '6', '4', '2', '1', '3'])

Please note also that my code creates a DOT 'digraph' string for visualization in GraphVis. You can safely leave that out once you gain confidence in the algorithm. You can reverse the commented and uncommented append lines near the end to get the major nodes first. Also note, this solution determines a solution that satisfies the graph. There may be others, and it doesn't determine the order as you required, but the graph satisfaction is one correct answer.

TomServo
  • 7,248
  • 5
  • 30
  • 47
0

I wrote this code which solves my problem, but maybe there is some more elegant solution?

def process_cursor(G, passed, node_id):
    if set(G.predecessors(node_id)).issubset(passed):
        return True, list(G.successors(node_id))
    return False, None


def get_all_roots(G: nx.DiGraph):
    for node_id in G.nodes:
        if not any(G.predecessors(node_id)):
            yield node_id


def order_components(G: nx.DiGraph):
    nodes_amount = len(G.nodes)
    cursors = set(get_all_roots(G))
    passed = []
    iterations = 0
    while len(passed) != nodes_amount:
        iterations += 1
        if iterations > nodes_amount:
            raise ValueError("Could not create sequence of graph.")
        step = []
        next_cursors = []
        step_passed = []
        for node_id in cursors:
            can_process, tmp_cursors = process_cursor(G, passed, node_id)
            if can_process:
                next_cursors.extend(tmp_cursors)
                step_passed.append(node_id)
                node_data = G.nodes[node_id]
                step.append(node_id)
        cursors = set(next_cursors)
        passed.extend(step_passed)
        if step:
            yield step
    yield append
Taras Protsenko
  • 545
  • 1
  • 6
  • 15