I generate a random directed graph with networkx and want to retrieve its strongly connected components (SCCs).

Why do I get only one large SCC using the built-in networkx function?

import random
import networkx as nx

n = random.randint(10, 500)    # number of nodes
p = random.randint(1, 9) / 10  # edge probability, 0.1 to 0.9
graph = nx.gnp_random_graph(n, p, seed=None, directed=True)
print("Nodes {} Edge density {}".format(n, p))
# strongly_connected_components returns a generator of node sets
sccs = list(nx.strongly_connected_components(graph))
sccs.reverse()
print(sccs)

Nodes 127 Edge density 0.9
[{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126}]

Even if I force the edge density to 0.1, I still get a single SCC.

p = 0.1  # force a low edge density so the printed value matches the graph
graph = nx.gnp_random_graph(n, p, seed=None, directed=True)
print("Nodes {} Edge density {}".format(n, p))
sccs = list(nx.strongly_connected_components(graph))
sccs.reverse()
print(sccs)

Nodes 484 Edge density 0.1
<generator object strongly_connected_component_subgraphs at 0x000001B8019A9468>
[{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 
421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483}]
Guido
  • What do you mean "the single nodes as scc"? – xyzjayne Jul 09 '18 at 22:09
  • Note that the list has only one item. You have only one SCC that contains all nodes, which is totally expected, given the high requested density. – DYZ Jul 09 '18 at 22:10
  • @DyZ I have added the results for a graph with low density, because unfortunately I get the same result on every run. – Guido Jul 09 '18 at 22:52
  • @xyzjayne Fixed. They weren't single-node SCCs as with density = 0, but one large SCC containing all nodes. – Guido Jul 09 '18 at 22:53

1 Answer

DyZ already summarized the issue.

You called networkx to make you a random graph with 127 nodes, with about 90% of the possible edges drawn in. An average node of this graph will have about 113 outgoing edges (126 × 0.9), and as many incoming. The chance of finding even a single node that is disconnected enough to be separable is astronomically small; finding a cluster of such nodes is even less likely.

Yes, you can construct graphs with (relatively) thin connections, but at this proportion, any SCC algorithm worth its vectorizations will keep the entire graph in one large component. Perhaps some spectral clustering algorithm could identify dense regions and gaps that would get you the sort of division you seem to seek.
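
To put rough numbers on this, here is a minimal sketch in plain Python. It assumes the usual G(n, p) model, where each of the n·(n−1) possible directed edges is present independently with probability p. The helper names are made up for illustration and are not part of networkx. A node with no outgoing edge, or no incoming edge, is guaranteed to be an SCC by itself, so the expected number of such nodes gives a feel for how unlikely the simplest kind of split is.

```python
def expected_out_degree(n, p):
    """Expected number of outgoing edges per node in a directed G(n, p) graph."""
    return (n - 1) * p


def prob_no_out_edges(n, p):
    """Probability that one given node has no outgoing edge at all.

    By symmetry this is also the probability of having no incoming edge.
    """
    return (1 - p) ** (n - 1)


def expected_degenerate_nodes(n, p):
    """Upper bound on the expected number of nodes with zero out-degree
    or zero in-degree (a node with neither kind of edge is counted twice)."""
    return 2 * n * prob_no_out_edges(n, p)


if __name__ == "__main__":
    for n, p in [(127, 0.9), (127, 0.1), (484, 0.1)]:
        print(
            "n={} p={}: expected out-degree {:.1f}, "
            "expected degenerate nodes {:.3g}".format(
                n, p, expected_out_degree(n, p), expected_degenerate_nodes(n, p)
            )
        )
```

This only covers the simplest way a graph can fail to be strongly connected, so treat it as an order-of-magnitude guide rather than an exact probability of getting more than one SCC.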

Prune
  • Thank you for the reply. I don't need any other kind of division. I just want to be sure that it is really working, because on every run I get a single SCC containing the entire set of nodes. Is this reasonable even with density 0.1? – Guido Jul 09 '18 at 22:55
  • I would think so. At density 0.1, you still have an average of almost 13 edges from each node. If you want to isolate some nodes, you'll need either a contingent randomization, or a somewhat lower density -- and be ready for multiple isolated nodes (single-node SCCs) in that case. – Prune Jul 09 '18 at 23:29
  • Density 0.1 would yield a lot of components for a very small graph. But here, with over 120 nodes, you'd be saying that one tenth of all possible edges exist. So the average is over 12 edges out (and 12 in). The expected number of paths of length 2 from a node is over 12^2. Imagine that you remove one node `u` and find a strongly connected component. What is the probability that there is no path from that component to `u` and from `u` to that component? – Joel Jul 09 '18 at 23:30
  • To add a bit more: you'll need to get the average degree down to around 2 or so to start seeing multiple components. – Joel Jul 10 '18 at 04:43
  • Thank you all, I got it and now I have the expected behavior lowering the degree indeed – Guido Jul 11 '18 at 09:34
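
The effect discussed in the comments can be checked empirically. The following is a small sketch, not code from the thread: `scc_summary` is a hypothetical helper, and the seed and the list of average degrees are arbitrary choices. It converts a target average out-degree into an edge probability, builds the directed graph with `nx.gnp_random_graph`, and reports how many SCCs there are and how large the biggest one is.

```python
import networkx as nx


def scc_summary(n, avg_out_degree, seed=None):
    """Return (number of SCCs, size of the largest SCC) for a directed
    G(n, p) graph whose expected out-degree is avg_out_degree."""
    # Each node has n - 1 possible targets, so p = degree / (n - 1).
    p = min(1.0, avg_out_degree / (n - 1))
    graph = nx.gnp_random_graph(n, p, seed=seed, directed=True)
    sizes = sorted(
        (len(c) for c in nx.strongly_connected_components(graph)),
        reverse=True,
    )
    return len(sizes), sizes[0]


if __name__ == "__main__":
    n = 127
    # 12.6 corresponds to p = 0.1 for 127 nodes.
    for degree in (0.5, 1, 2, 5, 12.6):
        count, largest = scc_summary(n, degree, seed=42)
        print(
            "avg out-degree {:>4}: {} SCCs, largest has {} nodes".format(
                degree, count, largest
            )
        )
```

Running this with a few different seeds should show a single component at the higher degrees and many small ones as the average degree drops, which is the behavior described above.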