
Given a directed graph (which may contain cycles) where each node has a value, how can we compute, for each node, the sum of the values of all nodes reachable from it? For example, in the following graph:

[image: directed graph]

the reachable sum for node 1 is: 2 + 3 + 4 + 5 + 6 + 7 = 27

the reachable sum for node 2 is: 4 + 5 + 6 + 7 = 22

.....

My solution: I think the sums for all nodes can be computed in O(n + m), where n is the number of nodes and m is the number of edges. Use DFS: for each node, recursively visit its successors, and cache each successor's sum once it has been computed so that it never has to be recalculated. Each node also needs its own set of visited nodes to avoid endless recursion caused by cycles.

Does it work? I don't think it is elegant enough, especially since many sets have to be created. Is there a better solution? Thanks.

user3386109
user8142520
  • Possible duplicate of [Reachability Count for all nodes in a Directed Acyclic Graph](https://stackoverflow.com/questions/35829819/reachability-count-for-all-nodes-in-a-directed-acyclic-graph) – Prune Jan 22 '18 at 19:51
  • Thanks @Prune but I am talking about directed graph rather than DAG. – user8142520 Jan 22 '18 at 19:54
  • In that case, the algorithm is very similar; you simply keep a list of nodes visited, to filter what you add to the "not done" list. – Prune Jan 22 '18 at 19:58

2 Answers


This can be done by first finding the Strongly Connected Components (SCCs), which takes O(|V|+|E|). Then build a new graph, G', over the SCCs (each SCC is a node in the graph), where each node's value is the sum of the values of the nodes in that SCC.

Formally,

G' = (V',E')
Where V' = {U_1, U_2, ..., U_k | U_i is an SCC of the graph G}
E' = {(U_i,U_j) | there is node u_i in U_i and u_j in U_j such that (u_i,u_j) is in E }

This graph (G') is a DAG, so the question becomes simpler, and it seems to be a variant of the question linked in the comments.
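The construction of G' can be sketched as follows. This is only a minimal illustration, assuming nodes are labelled 0..n-1 and using Kosaraju's two-pass algorithm (any linear-time SCC algorithm would do); the function name `condense` and its return shape are hypothetical, not part of the answer.

```python
from collections import defaultdict

def condense(n, edges, value):
    """Return (comp, comp_value, dag) for a graph with nodes 0..n-1.

    comp[v] is the SCC index of node v, comp_value[c] is the summed value
    of SCC c, and dag[c] is the set of SCCs with an edge from SCC c.
    """
    adj, radj = defaultdict(list), defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)

    # Pass 1: iterative DFS on G, recording nodes in order of finish time.
    order, seen = [], [False] * n
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, iter(adj[s]))]
        while stack:
            node, it = stack[-1]
            nxt = next((w for w in it if not seen[w]), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen[nxt] = True
                stack.append((nxt, iter(adj[nxt])))

    # Pass 2: sweep the reversed graph in decreasing finish time;
    # each sweep collects exactly one SCC.
    comp, k = [-1] * n, 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = k
        stack = [s]
        while stack:
            node = stack.pop()
            for w in radj[node]:
                if comp[w] == -1:
                    comp[w] = k
                    stack.append(w)
        k += 1

    # Build G': summed values per SCC, plus edges between distinct SCCs.
    comp_value = [0] * k
    for v in range(n):
        comp_value[comp[v]] += value[v]
    dag = [set() for _ in range(k)]
    for u, v in edges:
        if comp[u] != comp[v]:
            dag[comp[u]].add(comp[v])
    return comp, comp_value, dag
```

For a cycle 0 → 1 → 2 → 0 with an extra edge 2 → 3, the three cycle nodes collapse into one node of G' whose value is the sum of their values, with a single edge to the SCC containing node 3.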

EDIT: the previous answer (struck out) was mistaken from this point on, so I am replacing it with a new answer. Sorry about that.

Now, a DFS can be used from each node to find the sum of values:

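# clear all visited flags before each top-level call
DFS(v):
  if v.visited:
    return 0
  v.visited = true   # mark before recursing so each node is counted once
  # a leaf has no children, so this returns just v.value for it;
  # the result includes v's own value, subtract it if the start should not count
  return v.value + sum([DFS(u) for u in v.children])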
  • This is O(V^2 + VE) in the worst case, but since G' has no more nodes or edges than G (and fewer whenever G contains cycles), V and E here can be significantly smaller.
  • Some local optimizations can be made, for example, if a node has a single child, you can reuse the pre-calculated value and not apply DFS on the child again, since there is no fear of counting twice in this case.
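A rough Python sketch of running one traversal per node of G' is shown below. It assumes G' is given as a list `dag` of successor collections and a parallel list `comp_value` of summed SCC values; both names are hypothetical. It uses an explicit stack rather than recursion, and a fresh visited set per start node so that nodes reachable along several paths are counted once.

```python
def reachable_sums(dag, comp_value):
    """For each node c of the DAG, sum the values of every node reachable from c.

    The start node's own value is included in its total; subtract
    comp_value[c] afterwards if the start should not count itself.
    """
    totals = []
    for start in range(len(dag)):
        visited = {start}          # fresh set for every start node
        stack = [start]
        total = 0
        while stack:
            node = stack.pop()
            total += comp_value[node]
            for nxt in dag[node]:
                if nxt not in visited:
                    visited.add(nxt)   # mark on push so nothing is pushed twice
                    stack.append(nxt)
        totals.append(total)
    return totals
```

On a diamond-shaped DAG (0 → 1, 0 → 2, 1 → 3, 2 → 3) with values 1, 2, 3, 4, node 3 is reachable from node 0 along two paths but contributes its value only once, so the total for node 0 is 10.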

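The earlier DP solution for the DAG (struck out in the answer because, as the comments note, it double counts nodes that are reachable along more than one path) was: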

D[i] = value(i) + sum {D[j] | (i,j) is an edge in G' }

This can be calculated in linear time (after topological sort of the DAG).

Pseudo code:

  1. Find SCCs
  2. Build G'
  3. Topological sort G'
  4. Find D[i] for each node in G'
  5. Assign the value D[i] to every node u_i in U_i, for each U_i.

Total time is O(|V|+|E|).
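To see the double counting raised in the comments, here is a small sketch that applies the recurrence D[i] = value(i) + sum of D[j] to a diamond-shaped DAG. The function name `dp_sums` and the example graph are assumptions made for illustration only.

```python
def dp_sums(dag, value, topo_order):
    """Evaluate D[i] = value[i] + sum(D[j] for each edge (i, j)).

    topo_order lists the nodes so that every edge goes from an earlier
    node to a later one; processing it in reverse makes D[j] available
    before D[i] needs it.
    """
    D = [0] * len(dag)
    for i in reversed(topo_order):
        D[i] = value[i] + sum(D[j] for j in dag[i])
    return D

# Diamond: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
dag = [[1, 2], [3], [3], []]
value = [1, 2, 3, 4]
print(dp_sums(dag, value, [0, 1, 2, 3]))  # prints [14, 6, 7, 4]
```

The nodes reachable from node 0 (including itself) have values 1 + 2 + 3 + 4 = 10, but the recurrence gives 14 because node 3's value is added once through node 1 and once through node 2.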

amit
  • Doesn't this DP solution double count nodes that have multiple paths to them? – luqui Mar 31 '18 at 10:28
  • Is there any difference between this and a transitive closure of the graph? -- that can't be computed in this time AFAIK, but for the life of me, I just can't figure out what that has that this doesn't. – user2268997 Jun 05 '19 at 01:00
  • @luqui Yes, you're right. This will count those nodes twice in this case. – Moustafa Saleh Nov 10 '19 at 04:12
  • Thanks for the comments. It somehow escaped my attention to fix it. Here is a small fix, much less than I hoped for. I still hope there is a way to re-use pre-calculated data, but am not sure how at this point. I still think using SCC beforehand is probably a good idea. Since SCC is linear, it is enough to save a constant number of nodes for it to pay off when running a quadratic follow-up algorithm. – amit May 11 '20 at 08:24

You can use either DFS or BFS to solve your problem. Both run in O(V + E) for a single starting node.

You don't have to compute the values for all nodes, and you don't need recursion. Just do something like this.

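A typical graph traversal looks like this (with vertices taken from the front of the list and added to the end, it visits them in BFS order).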

unmark all vertices
choose some starting vertex x
mark x
list L = x
while L nonempty
    choose some vertex v from front of list
    visit v
    for each unmarked neighbor w
        mark w
        add it to end of list

In your case, you have to add a couple of lines:

unmark all vertices
choose some starting vertex x
mark x
list L = x
float sum = 0
while L nonempty
    choose some vertex v from front of list
    visit v
    sum += v->value
    for each unmarked neighbor w
        mark w
        add it to end of list
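The marked-list traversal above could be written in Python roughly as follows. This is a sketch under assumptions: the graph is a dict `adj` mapping each vertex to a list of neighbours, `value` maps each vertex to its number, and the function name `reachable_sum_from` is hypothetical. Like the pseudocode, it includes the starting vertex's own value in the sum.

```python
from collections import deque

def reachable_sum_from(adj, value, start):
    """Sum the values of start and of every vertex reachable from it."""
    marked = {start}
    queue = deque([start])
    total = 0
    while queue:
        v = queue.popleft()            # take a vertex from the front of the list
        total += value[v]              # "visit" v
        for w in adj.get(v, ()):
            if w not in marked:        # only unmarked neighbours are queued
                marked.add(w)
                queue.append(w)        # add it to the end of the list
    return total
```

Because vertices are marked when they are queued, cycles cause no trouble: in a graph with edges 0 → 1 → 2 → 0 and 2 → 3, starting from vertex 0 visits each of the four vertices exactly once.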
Gor
  • Note that this solution is `O(|V|+|E|)` **for a single node.** If you want to find the sum of all reachable nodes from each vertex, you need to run it for each node again, giving total time of `O(|V|^2 + |E||V|)` (which is what the question asks for). – amit Jan 23 '18 at 11:30