
I'm dealing with rooted, directed, potentially cyclic graphs. Each vertex in the graph has a label, which might or might not be unique. Edges do not have labels. The graph has a designated root vertex from which every vertex is reachable. The order of the edges outgoing from a vertex is relevant.

For my purposes, a vertex is equal to another vertex if they share the same label, and if their outgoing edges are also considered equal (and are in the same order). Two edges are equal if they have the same direction and if the vertices at their corresponding ends are equal.
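To make the equality rule concrete, here is a minimal sketch in Python. The `Vertex` class and the `vertices_equal` function are hypothetical names used only for illustration. One assumption goes beyond the rule as stated: when the comparison re-enters a pair of vertices that is already being compared (which happens on cycles), the pair is assumed equal, so the recursion terminates.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: keep identity-based ==, the structural check is separate
class Vertex:
    label: int
    children: list = field(default_factory=list)  # ordered outgoing edges


def vertices_equal(a, b, assumed=None):
    """Return True if a and b have equal labels and pairwise-equal ordered children."""
    if assumed is None:
        assumed = set()
    if a is b:
        return True
    if a.label != b.label or len(a.children) != len(b.children):
        return False
    key = (id(a), id(b))
    if key in assumed:
        # Assumption: a pair already under comparison is treated as equal,
        # which is what lets the check terminate on cyclic graphs.
        return True
    assumed.add(key)
    return all(vertices_equal(x, y, assumed)
               for x, y in zip(a.children, b.children))
```

The recursion depth grows with the length of the paths explored, so a very deep graph would need an iterative version with an explicit stack.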

Because of the equality rules above, a graph can contain multiple "sections" that are effectively equal. For example, in the graph below, there are two isomorphic sections containing vertices with labels {1, 2, 3, 4}. The root of the graph is vertex 0.

graph
(source: graphonline.ru)

I need to be able to identify sections that are identical, and then remove all duplication, without changing the "meaning" of the graph (with regard to the equality rules above). Using the above example as input, I need to produce this:

graph
(source: graphonline.ru)

Is there a known way of doing this within polynomial time?

Glorfindel
Joshua Wise
  • A first idea would be to define a deterministic way of iterating through the graph from the root node (for example, depth-first search), so that the search always takes the same path. You would then record all node labels in one long sequence, e.g. a string. In this string you can find all repeated sub-sequences of length two or more; these are your duplicate sections. After finding these sub-sequences, you erase the duplicates and rebuild the graph. – Jan Aug 29 '19 at 18:55
  • This wouldn't work because of cycles in the graph. You'd enter an infinite loop trying to do a depth-first search. Also, if you remembered which nodes you've visited, and simply ignored nodes that have already been visited, then you'd be ignoring edges that could potentially differentiate two sections (edges that go to already-visited nodes). Maybe something like this could work if we also encoded the edges into the string, somehow. – Joshua Wise Aug 29 '19 at 19:42
  • The problem is that encoding edges into a string seems impossible, given that the edge's description depends on its outgoing vertex, and that vertex's description depends on more edges. – Joshua Wise Aug 29 '19 at 20:14
  • You could indeed do it with edges, I think. Do a DFS from the root and classify the edges as tree, back, forward, and cross edges. Cycles are not a problem at all. These edges can be sorted and compared. – Jan Aug 30 '19 at 05:35

1 Answer


The solution that ended up working was to essentially run the recursive equality check against every pair of vertices with the same label.

  1. Let S = all pairs of vertices with the same label
  2. For each s in S:
    1. Compare the two vertices a and b in s by recursively comparing their children
    2. If they compare as equal, take all edges in the graph pointing to b, and point them to a instead
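The steps above can be sketched roughly as follows. This is not the original implementation, only an illustration under stated assumptions: the graph is stored as a list `labels` and a list of ordered child lists `children`, both indexed by vertex number, and the names `equal` and `deduplicate` are hypothetical. The answer does not say how cycles are handled, so the sketch assumes that a pair already under comparison counts as equal.

```python
from itertools import combinations


def equal(labels, children, a, b, assumed):
    """Recursive equality check; `assumed` holds the pairs already being compared."""
    if a == b:
        return True
    if labels[a] != labels[b] or len(children[a]) != len(children[b]):
        return False
    if (a, b) in assumed:
        return True  # assumption: treat in-progress pairs as equal (cycles)
    assumed.add((a, b))
    return all(equal(labels, children, x, y, assumed)
               for x, y in zip(children[a], children[b]))


def deduplicate(labels, children, root):
    """Merge equal vertices; return (root, children, reachable vertices)."""
    children = [list(outs) for outs in children]  # work on a copy
    by_label = {}
    for v, lab in enumerate(labels):
        by_label.setdefault(lab, []).append(v)
    merged = set()  # vertices that were already redirected elsewhere
    for group in by_label.values():           # step 1: pairs with the same label
        for a, b in combinations(group, 2):   # step 2: each pair (a, b)
            if a in merged or b in merged:
                continue
            if equal(labels, children, a, b, set()):
                # Point every edge that targets b at a instead.
                for outs in children:
                    for i, target in enumerate(outs):
                        if target == b:
                            outs[i] = a
                if root == b:
                    root = a
                merged.add(b)
    # Vertices that were merged away are no longer reachable from the root.
    reachable, stack = set(), [root]
    while stack:
        v = stack.pop()
        if v not in reachable:
            reachable.add(v)
            stack.extend(children[v])
    return root, children, sorted(reachable)
```

For example, with `labels = [0, 1, 2, 1, 2]` and `children = [[1, 3], [2], [1], [4], [3]]`, the two cycles 1 → 2 → 1 and 3 → 4 → 3 are equal, so both edges from the root end up pointing at vertex 1. Each comparison touches every pair of vertices at most once, and there are at most a quadratic number of same-label pairs, so the whole procedure stays polynomial.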
Joshua Wise
  • One detail comes to mind. Did you take advantage of the fact that when you recursively compare the children of a pair of vertices *a* and *b*, you may identify other pairs of vertices that are identical? In your example graph above, in the process of establishing that vertices (1) are "equal", you will recognize that the pairs of vertices (2), (3) and (4) are equal. Thus you do not need to check these pairs again. – Patrick Sep 06 '19 at 01:17
  • You know, I had thought of this, but then I forgot to implement it. Thanks for pointing this out! – Joshua Wise Sep 09 '19 at 16:20
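The optimization from the comments could look something like the sketch below, which is an illustration rather than the code that was actually used. It assumes the same hypothetical representation as before (a `labels` list and a `children` list of ordered child lists, indexed by vertex number) and a hypothetical function name `equal_pairs`. Instead of a plain yes/no answer, a successful comparison returns every pair it visited. All of those pairs are themselves equal, so they can be merged right away without being compared again. A failed comparison returns `None`, because the pairs visited up to that point were only assumed equal.

```python
def equal_pairs(labels, children, a, b):
    """Return the set of vertex pairs shown equal while comparing a and b,
    or None if a and b are not equal."""
    seen = set()
    stack = [(a, b)]
    while stack:
        x, y = stack.pop()
        if x == y or (x, y) in seen:
            continue  # identical vertex, or pair already visited (cycle)
        if labels[x] != labels[y] or len(children[x]) != len(children[y]):
            return None  # one mismatch makes the top-level pair unequal
        seen.add((x, y))
        # Children are compared position by position, respecting edge order.
        stack.extend(zip(children[x], children[y]))
    return seen
```

Using an explicit stack rather than recursion also avoids hitting Python's recursion limit on deep graphs.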