Computing nearest vertex neighbors in a DAG completed by transitive-closure

Question

Consider a directed graph, like this:

[Figure: the example directed graph, with solid black and dashed edges]

where (A) the solid black edges are asserted initially:

0 → {1,3}
1 → {2}
3 → {4}
4 → {2}

and (B) the transitive closure has then been computed, adding the following (dashed) edges:

0 → {2,4}
3 → {2}

For any vertex in this final graph, how can I efficiently compute its 'immediate' neighbors, that is, those reachable by an edge but not also reachable by a different, longer path? The output I want is shown in (A). I am not distinguishing between edges which were asserted (solid) or inferred (dashed).

Is there a well-known name for this problem, and is there a straightforward way of achieving this with JGraphT?

Thoughts:

Perhaps this is possible by using a topological sort of the vertices, e.g. TS = [0,1,3,4,2].

for (i=0; i<TS.len; i++) {
  var v0 = TS[i]
  for (j=i+1; j<TS.len; j++) {
    var v1 = TS[j]
    if (edge exists v0 -> v1) {
      var otherPath = false
      for (k=i+1; k<j; k++) {
        var v2 = TS[k]
        if (path exists v0 -> v2 && path exists v2 -> v1) {
          otherPath = true 
          break
        }
      }
      if (!otherPath) {
        record (v0 -> v1)
      }
    }
  } 
}

Basically, I'm thinking that (v0 → v1) is a solution when no other, longer path (v0 → ... v2 ... → v1) exists, where v2 lies between v0 and v1 in the topological sort. Does this look correct, and/or is there a more efficient way?
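For a graph that is already transitively closed, the check described above can be simplified, because "path exists" and "edge exists" coincide: an edge (u → w) has a longer alternative exactly when some other successor v of u also has an edge to w. The result is what is commonly called the transitive reduction. The following is a minimal sketch of that idea in plain Java, assuming the graph is acyclic, is already transitively closed, and is stored as a map from each vertex to its successor set. The class and method names are hypothetical and are not part of JGraphT.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of JGraphT.
class ClosedDagReductionSketch
{
    // ASSUMPTION: succ describes an acyclic graph that is already
    // transitively closed (every reachable vertex is a direct successor).
    static Map<Integer, Set<Integer>> immediateSuccessors(
        Map<Integer, Set<Integer>> succ)
    {
        Map<Integer, Set<Integer>> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, Set<Integer>> entry : succ.entrySet())
        {
            Set<Integer> kept = new LinkedHashSet<>();
            for (Integer w : entry.getValue())
            {
                boolean detour = false;
                for (Integer v : entry.getValue())
                {
                    // v is an intermediate vertex if u -> v and v -> w
                    Set<Integer> next =
                        succ.getOrDefault(v, Collections.emptySet());
                    if (!v.equals(w) && next.contains(w))
                    {
                        detour = true;
                        break;
                    }
                }
                if (!detour)
                {
                    kept.add(w);
                }
            }
            result.put(entry.getKey(), kept);
        }
        return result;
    }

    // The closed graph from the question: edges of (A) plus those of (B)
    static Map<Integer, Set<Integer>> exampleClosure()
    {
        Map<Integer, Set<Integer>> succ = new LinkedHashMap<>();
        succ.put(0, new LinkedHashSet<>(Arrays.asList(1, 3, 2, 4)));
        succ.put(1, new LinkedHashSet<>(Arrays.asList(2)));
        succ.put(2, new LinkedHashSet<>());
        succ.put(3, new LinkedHashSet<>(Arrays.asList(4, 2)));
        succ.put(4, new LinkedHashSet<>(Arrays.asList(2)));
        return succ;
    }

    public static void main(String[] args)
    {
        System.out.println(immediateSuccessors(exampleClosure()));
    }
}
```

This sketch needs neither a topological sort nor a path search, but it relies entirely on the closure assumption. On a graph that is not transitively closed it can keep edges that do have a longer alternative path.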

Are the transitive edges necessarily already added when you do the work? Obviously for the transitive edges this won't be true, so it would be easier to do the calc without them. — Evan Knowles, May 28 '14 at 10:26
Yes, that's the problem! The graph includes all edges which can be inferred by its transitive closure. — , May 28 '14 at 11:53
What if the original graph happened to have that direct connection as well as the longer one, so the direct connection wasn't a transitive edge? — Evan Knowles, May 28 '14 at 12:42
I'm not distinguishing between edges which are asserted, or those which were inferred by the transitive closure. I'm just dealing with a graph that has a certain shape -- one completed by transitive closure -- though perhaps introducing that step was confusing. For clarity, I've edited my post to make my question **bold**. — , May 28 '14 at 22:37
I'm not sure about your approach with the topological sorting. How would you check whether a "path exists"? **IF** you can do this efficiently, then you could just do this for each edge: 1. remove the edge, 2. check whether there still is a path between its endpoints, 3. if there is no path: store the edge in the result set, 4. re-insert the edge. Maybe I misunderstood something there, but the checks for path existence may be expensive anyhow — Marco13, May 28 '14 at 23:02
Path checking between two vertices can be performed over the original graph using [Dijkstra's algorithm](http://en.wikipedia.org/wiki/Dijkstra's_algorithm) in PTIME. I'm not sure about my topological sort approach either, which is why I'm after a solution that is correct and (hopefully) more efficient! — , May 28 '14 at 23:04
Well, to check for the *existence* of a path you don't even need a *shortest* path algorithm. A simple BFS would do. But you have to do this for each edge, that's why I suspected that you might be looking for something more efficient than that... — Marco13, May 28 '14 at 23:13
Non-existence of a path using naive BFS may be expensive, as it may search large portions of the graph which cannot contain a solution. However, I'm thinking that the BFS can be limited by the topological order, in that no path `v0 -> ... -> vN` where `v1 > vN` in the order need be further considered. — , May 29 '14 at 00:15

Marco13 · Accepted Answer · 2014-05-29T13:01:36.407

The approach mentioned in the comments, implemented with JGraphT:

It simply iterates over all edges, removes each edge temporarily, and checks whether the vertices of the edge are still connected (using a simple Breadth-First-Search). If they are no longer connected, then the edge is added to the result set (before it is re-inserted into the graph).

This is rather trivial, so I assume that there is a more sophisticated solution. Particularly the checks for the existence of a path (even with a simple BFS) might become prohibitively expensive for "larger" graphs.

EDIT: Extended the code with a first attempt (!) at implementing the topological constraint that was mentioned in the original question. It basically follows the same concept as the simple approach: For each edge, it checks whether the vertices of the edge would still be connected if the edge were removed. However, this connectivity check is not done with a simple BFS, but with a "constrained" BFS. This BFS skips all vertices whose topological index is greater than the topological index of the end vertex of the edge.

Although this delivers the same result as the simple approach, the topological sort and the implied constraint are somewhat brain-twisting and its feasibility should be analyzed in a more formal way. This means: I am NOT sure whether the result is correct in every case.

IF the result is correct, it could indeed be more efficient. The program prints the number of vertices that are scanned during the simple BFS and during the constrained BFS. The number of vertices for the constrained BFS is lower, and this advantage should grow as the graph becomes larger: the simple BFS will basically always have to scan all edges, resulting in a worst case of O(e) per check and a complexity of O(e*e) for the whole algorithm. The complexity of the topological sorting depends on the structure of the graph, but should be linear for the graphs in question. However, I did not explicitly analyze what the complexity of the constrained BFS will be.

import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

import org.jgrapht.DirectedGraph;
import org.jgrapht.Graph;
import org.jgrapht.graph.DefaultEdge;
import org.jgrapht.graph.SimpleDirectedGraph;
import org.jgrapht.traverse.BreadthFirstIterator;
import org.jgrapht.traverse.TopologicalOrderIterator;

public class TransitiveGraphTest
{
    public static void main(String[] args)
    {
        DirectedGraph<String, DefaultEdge> g = 
            createTestGraph();

        Set<DefaultEdge> resultSimple = compute(g);
        Set<DefaultEdge> resultTopological = computeTopological(g);

        System.out.println("Result simple      "+resultSimple);
        System.out.println("Visited            "+debugCounterSimple);
        System.out.println("Result topological "+resultTopological);
        System.out.println("Visited            "+debugCounterTopological);
    }

    private static int debugCounterSimple = 0;
    private static int debugCounterTopological = 0;


    //========================================================================
    // Simple approach: For each edge, check with a BFS whether its vertices
    // are still connected after removing the edge

    private static <V, E> Set<E> compute(DirectedGraph<V, E> g)
    {
        Set<E> result = new LinkedHashSet<E>();
        Set<E> edgeSet = new LinkedHashSet<E>(g.edgeSet());
        for (E e : edgeSet)
        {
            V v0 = g.getEdgeSource(e);
            V v1 = g.getEdgeTarget(e);
            g.removeEdge(e);
            if (!connected(g, v0, v1))
            {
                result.add(e);
            }
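            // Re-insert the same edge object, so that the edges in the
            // result set still belong to the graph
            g.addEdge(v0, v1, e);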
        }
        return result;
    }

    private static <V, E> boolean connected(Graph<V, E> g, V v0, V v1)
    {
        BreadthFirstIterator<V, E> i = 
            new BreadthFirstIterator<V, E>(g, v0);
        while (i.hasNext())
        {
            V n = i.next();
            debugCounterSimple++;
            if (n.equals(v1))
            {
                return true;
            }
        }
        return false;

    }


    //========================================================================
    // Topological approach: For each edge, check whether its vertices
    // are still connected after removing the edge, using a BFS that
    // is "constrained", meaning that it does not traverse past 
    // vertices whose topological index is greater than the end
    // vertex of the edge

    private static <V, E> Set<E> computeTopological(DirectedGraph<V, E> g)
    {
        Map<V, Integer> indices = computeTopologicalIndices(g);
        Set<E> result = new LinkedHashSet<E>();
        Set<E> edgeSet = new LinkedHashSet<E>(g.edgeSet());
        for (E e : edgeSet)
        {
            V v0 = g.getEdgeSource(e);
            V v1 = g.getEdgeTarget(e);
            boolean constrainedConnected =
                constrainedConnected(g, v0, v1, indices);
            if (!constrainedConnected)
            {
                result.add(e);
            }
        }        
        return result;
    }

    private static <V, E> Map<V, Integer> computeTopologicalIndices(
        DirectedGraph<V, E> g)
    {
        Queue<V> q = new ArrayDeque<V>();
        TopologicalOrderIterator<V, E> i = 
            new TopologicalOrderIterator<V, E>(g, q);
        Map<V, Integer> indices = new LinkedHashMap<V, Integer>();
        int index = 0;
        while (i.hasNext())
        {
            V v = i.next();
            indices.put(v, index);
            index++;
        }
        return indices;
    }


    private static <V, E> boolean constrainedConnected(
        DirectedGraph<V, E> g, V v0, V v1, Map<V, Integer> indices)
    {
        Integer indexV1 = indices.get(v1);
        Set<V> visited = new LinkedHashSet<V>();
        Queue<V> q = new LinkedList<V>();
        q.add(v0);
        while (!q.isEmpty())
        {
            V v = q.remove();
            debugCounterTopological++;
            if (v.equals(v1))
            {
                return true;
            }
            Set<E> outs = g.outgoingEdgesOf(v);
            for (E out : outs)
            {
                V ev0 = g.getEdgeSource(out);
                V ev1 = g.getEdgeTarget(out);
                if (ev0.equals(v0) && ev1.equals(v1))
                {
                    continue;
                }
                V n = g.getEdgeTarget(out);
                if (visited.contains(n))
                {
                    continue;
                }
                Integer indexN = indices.get(n);
                if (indexN <= indexV1)
                {
                    q.add(n);
                }
                visited.add(n);
            }
        }
        return false;
    }

    private static DirectedGraph<String, DefaultEdge> createTestGraph()
    {
        DirectedGraph<String, DefaultEdge> g =
            new SimpleDirectedGraph<String, DefaultEdge>(DefaultEdge.class);

        String v0 = "0";
        String v1 = "1";
        String v2 = "2";
        String v3 = "3";
        String v4 = "4";

        g.addVertex(v0);
        g.addVertex(v1);
        g.addVertex(v2);
        g.addVertex(v3);
        g.addVertex(v4);

        g.addEdge(v0, v1);
        g.addEdge(v0, v3);
        g.addEdge(v1, v2);
        g.addEdge(v3, v4);
        g.addEdge(v4, v2);

        g.addEdge(v0, v2);
        g.addEdge(v0, v4);
        g.addEdge(v3, v2);

        return g;
    }
}

+1. I suspect this solution might be more efficient than the one I've proposed. In contrast, it uses the edges as the starting point instead of testing combinations of vertices. Secondly, checking for the existence of a longer path between `v0` and `v1` even using BFS should still be cheap in general, because the transitive closure between `v_0 -> ... -> v_1` contains edges between every interior vertex `v_i` and `v_j` for `i < j`. I wonder if the topological sort can still be used to _limit_ the BFS though, such that we don't search _past_ vertices which are `> v1` in the order. Many thanks. — , May 28 '14 at 23:42
@sharky The approach sounds reasonable to me, although I find it difficult to analyze 1. whether it will always produce the correct result and 2. what its running time is. However, I edited the answer, extending it with a first attempt to implement this approach (and compare it to the "simple" one). — Marco13, May 29 '14 at 13:03
A topological sort induces a partial order over vertices such that their edge directions are always one-way (e.g., right for `[v0,v1,...,vN]`). So, if we're searching for a path from `v0 -> vi`, we can't possibly find one from `v0 -> ... -> vj` where `vi < vj` in the partial order, because the order indicates there is no path back from `vj` to `vi`. Therefore, limiting the BFS in this way is correct. However, the computational complexity is harder to analyse: the O(|V|+|E|) complexity of computing the order is outweighed by the complexity of path-finding between vertex pairs. — , May 29 '14 at 23:32
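
The argument in the last comment rests on one property: in a topological order, every edge points from a lower index to a higher one, so any path from `v0` to `v1` can only pass through vertices whose index lies between theirs. The following is a minimal sketch that computes such indices with Kahn's algorithm and checks the property, assuming an acyclic graph stored as an adjacency map in which every vertex appears as a key. The class and method names are hypothetical and independent of JGraphT.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JGraphT.
class TopologicalIndexSketch
{
    // Kahn's algorithm: maps each vertex to its position in one
    // topological order. ASSUMPTION: the graph is acyclic.
    static Map<Integer, Integer> indices(Map<Integer, List<Integer>> adj)
    {
        Map<Integer, Integer> inDegree = new LinkedHashMap<>();
        for (Integer v : adj.keySet())
        {
            inDegree.put(v, 0);
        }
        for (List<Integer> targets : adj.values())
        {
            for (Integer t : targets)
            {
                inDegree.merge(t, 1, Integer::sum);
            }
        }
        // Vertices without incoming edges can be emitted immediately
        Deque<Integer> ready = new ArrayDeque<>();
        for (Map.Entry<Integer, Integer> e : inDegree.entrySet())
        {
            if (e.getValue() == 0)
            {
                ready.add(e.getKey());
            }
        }
        Map<Integer, Integer> index = new LinkedHashMap<>();
        while (!ready.isEmpty())
        {
            Integer v = ready.remove();
            index.put(v, index.size());
            for (Integer t : adj.getOrDefault(v, Collections.emptyList()))
            {
                // A vertex becomes ready once all predecessors are emitted
                if (inDegree.merge(t, -1, Integer::sum) == 0)
                {
                    ready.add(t);
                }
            }
        }
        return index;
    }

    // True if every edge goes from a lower to a higher topological index
    static boolean allEdgesPointForward(
        Map<Integer, List<Integer>> adj, Map<Integer, Integer> index)
    {
        for (Map.Entry<Integer, List<Integer>> e : adj.entrySet())
        {
            for (Integer t : e.getValue())
            {
                if (index.get(e.getKey()) >= index.get(t))
                {
                    return false;
                }
            }
        }
        return true;
    }

    // The closed example graph from the question
    static Map<Integer, List<Integer>> example()
    {
        Map<Integer, List<Integer>> adj = new LinkedHashMap<>();
        adj.put(0, Arrays.asList(1, 3, 2, 4));
        adj.put(1, Arrays.asList(2));
        adj.put(2, Collections.<Integer>emptyList());
        adj.put(3, Arrays.asList(4, 2));
        adj.put(4, Arrays.asList(2));
        return adj;
    }
}
```

Because every edge points forward, a search for a path ending at `v1` can discard any vertex whose index exceeds that of `v1`. This is the pruning rule discussed in the comments; the sketch only illustrates the ordering property and says nothing about the running time of the pruned search.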
