
Consider a simple directed graph G = (V,E). A simple directed graph contains neither self-loops nor multiple edges. Let us further assume that G consists of a single (weakly) connected component.

Given a subset V' of V, I would like to create another simple directed graph G' = (V',E') so that G' has an edge (u,v) if and only if there is a path (any path, not necessarily the shortest path) from u to v in G that does not contain any other vertex in V'. Clearly G' is not really a subgraph of G, but the problem seems similar. Does it have a standard solution, or do I have to roll my own?

A concrete example may make clearer what I have in mind: Given a BPMN diagram consisting only of exclusive gateways, parallel gateways, tasks, and the sequence flow between them, I would like to reduce this diagram to the raw control flow. That is, I want to take out all the tasks, so that my V' consists of just the gateways. Then I want to connect any two gateways that have a path between them in the original diagram that goes only through tasks, not through other gateways. Such a path, if it exists, may not be the only path, nor the shortest, between the two gateways in the original diagram. If there are several such paths (which will usually be the case, as every branch of an opening gateway will usually lead through several tasks to the same closing gateway), of course I want my reduced graph to contain only one edge between those gateways.

Sebastian

1 Answer


I would do this:

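  1. Compute the reflexive-transitive closure of the subgraph of G induced by V - V' (that is, drop V' and every edge touching it). Let's call it (V - V', Et).
  2. An edge (u,v) with u ≠ v is in G' if and only if either (u,v) is itself in E, or there is a pair (ui,vi) in Et such that (u,ui) and (vi,v) are both in E. Because Et is reflexive, ui = vi covers the case of a single intermediate vertex.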

Step 1 costs O(|V - V'|**3) with Floyd–Warshall; step 2 costs O(|V'|**2) pair checks.
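Here is a minimal Python sketch of the two steps, not a tuned implementation. The function name `reduce_graph` and the representation (a list of vertices, a list of edge tuples, a set `keep` standing in for V') are my own assumptions. It also assumes the closure is taken reflexively and that direct edges between two kept vertices count, so that paths through zero or one intermediate vertex are not lost.

```python
from itertools import product


def reduce_graph(vertices, edges, keep):
    """Return E' for G' = (keep, E'), using closure-then-connect."""
    keep = set(keep)
    inner = [v for v in vertices if v not in keep]  # V - V'
    edge_set = set(edges)

    # Step 1: reflexive-transitive closure on V - V' (Warshall).
    # product() varies its first argument slowest, so k is the outer loop.
    reach = {(a, b): a == b or (a, b) in edge_set
             for a in inner for b in inner}
    for k, a, b in product(inner, inner, inner):
        if reach[a, k] and reach[k, b]:
            reach[a, b] = True

    # Step 2: connect u and v if u enters V - V', v is left from V - V',
    # and the entry point reaches the exit point; or if (u, v) is in E.
    succ_inner = {u: [w for w in inner if (u, w) in edge_set] for u in keep}
    pred_inner = {v: [w for w in inner if (w, v) in edge_set] for v in keep}
    result = set()
    for u, v in product(keep, keep):
        if u == v:
            continue  # G' is simple, so no self-loops
        if (u, v) in edge_set or any(
            reach[a, b] for a in succ_inner[u] for b in pred_inner[v]
        ):
            result.add((u, v))
    return result


# Hypothetical example: g* are gateways, t* are tasks.
vertices = ["g1", "t1", "t2", "t3", "g2", "g3"]
edges = [("g1", "t1"), ("t1", "t2"), ("t2", "g2"),
         ("g1", "t3"), ("t3", "g2"), ("g2", "g3")]
print(sorted(reduce_graph(vertices, edges, {"g1", "g2", "g3"})))
# [('g1', 'g2'), ('g2', 'g3')]
```

The two branches from g1 to g2 collapse into one edge because the result is a set, and there is no (g1, g3) edge because every path between them passes through g2.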

I also have a feeling that the Floyd–Warshall algorithm for transitive closure can be modified to do it all in one step. It would need extra data fields in dist (the distance or path-existence matrix), "begins in V'" and "ends in V'", and it should exclude V' from the intermediate vertices of a path.
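One possible reading of that idea, sketched in Python: run Warshall's closure over all of V, but only let vertices outside V' act as the intermediate (pivot) vertex. After the loop, the matrix entry for (u,v) is true exactly when there is a path from u to v whose intermediate vertices all lie in V - V'. This variant does not need the extra "begins in" / "ends in" fields; the function name `reduce_graph_single_pass` and the input representation are assumptions for illustration.

```python
def reduce_graph_single_pass(vertices, edges, keep):
    """Return E' for G' = (keep, E') with a pivot-restricted Warshall."""
    keep = set(keep)
    vertices = list(vertices)

    # reach[a][b] starts out as plain adjacency in G.
    reach = {a: {b: False for b in vertices} for a in vertices}
    for a, b in edges:
        reach[a][b] = True

    for k in vertices:
        if k in keep:
            continue  # a vertex of V' may never be an intermediate vertex
        for a in vertices:
            if not reach[a][k]:
                continue
            for b in vertices:
                if reach[k][b]:
                    reach[a][b] = True

    # Only pairs inside V' matter; drop self-loops to keep G' simple.
    return {(u, v) for u in keep for v in keep if u != v and reach[u][v]}


# Hypothetical example: g* are gateways, t* are tasks.
vertices = ["g1", "t1", "t2", "t3", "g2", "g3"]
edges = [("g1", "t1"), ("t1", "t2"), ("t2", "g2"),
         ("g1", "t3"), ("t3", "g2"), ("g2", "g3")]
print(sorted(reduce_graph_single_pass(vertices, edges, {"g1", "g2", "g3"})))
# [('g1', 'g2'), ('g2', 'g3')]
```

This is O(|V - V'| * |V|**2) in the worst case, so it trades a larger matrix for not having to stitch the closure back onto V' afterwards.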

Victor Sergienko
  • Thanks. Interleaving graph construction with transitive closure is a good idea. That should work in O(|E|**2): two nested for-loops over the edges, the first to identify edges originating in V', the second to follow the path in V - V' until I find an edge ending in V'. And since BPMN models are pretty sparse graphs, O(|E|**2) isn't too bad. – Sebastian Aug 27 '20 at 07:43