I'm trying to identify strongly connected communities within large group (undirected weighted graph). Alternatively, identifying vertices causing connection of sub-groups (communities) that would be otherwise unrelated.
The problem is part of broader Databricks solution thus Spark GraphX and GraphFrames are the first choice for resolving it.
As you can see from attached picture, I need to find vertex "X" as a point where can be split big continuous group identified by connected componect algorithms (val result = g.connectedComponents.run())
Strongly connected components method (for directed graph only), Triangle counting, or LPA community detection algorithms are not suitable, even if all weights are same, e.g. 1.
Picture with point, where should be cut big group ST0
Similar logic is nice described in question "Cut in a Weighted Undirected Connected Graph", but as a mathematical expression only.
Thanks for any hint.
// Vertex DataFrame
val v = sqlContext.createDataFrame(List(
(1L, "A-1", 1), // "St-1"
(2L, "B-1", 1),
(3L, "C-1", 1),
(4L, "D-1", 1),
(5L, "G-2", 1), // "St-2"
(6L, "H-2", 1),
(7L, "I-2", 1),
(8L, "J-2", 1),
(9L, "K-2", 1),
(10L, "E-3", 1), // St-3
(11L, "F-3", 1),
(12L, "Z-3", 1),
(13L, "X-0", 1) // split point
)).toDF("id", "name", "myGrp")
// Edge DataFrame
val e = sqlContext.createDataFrame(List(
(1L, 2L, 1),
(1L, 3L, 1),
(1L, 4L, 1),
(1L, 13L, 5), // critical edge
(2L, 4L, 1),
(5L, 6L, 1),
(5L, 7L, 1),
(5L, 13L, 7), // critical edge
(6L, 9L, 1),
(6L, 8L, 1),
(7L, 8L, 1),
(12L, 10L, 1),
(12L, 11L, 1),
(12L, 13L, 9), // critical edge
(10L, 11L, 1)
)).toDF("src", "dst", "relationship")
val g = GraphFrame(v, e)