I am working with a large network with thousands of nodes and edges. A reprex of the network can be found in a previous question, "Number of Connected Nodes in a dendrogram".
However, when counting the nodes in the network, I ran into a problem: the counts for the nodes at one level do not add up to the count for the node one level up. For example:
library(tidygraph)
library(ggraph)
library(tidyverse)

parent_child <- tribble(
  ~parent, ~child,
  "a", "b",
  "b", "c",
  "b", "d",
  "d", "e",
  "d", "f",
  "d", "g",
  "g", "z"
)

# converted to a dendrogram ------------
parent_child %>%
  as_tbl_graph() %>%
  ggraph(layout = "dendrogram") +
  geom_node_point() +
  geom_node_text(aes(label = name),
                 vjust = -1,
                 hjust = -1) +
  geom_edge_elbow()

# Table of calculations ----------------------
parent_child %>%
  as_tbl_graph() %>%
  activate(nodes) %>%
  mutate(n_community_out = local_size(order = graph_size(),
                                      mode = "out",
                                      mindist = 0)) %>%
  as_tibble()
# Final Output Table -----------------------
# A tibble: 8 x 2
  name  n_community_out
  <chr>           <dbl>
1 a                   8
2 b                   7
3 d                   5
4 g                   2
5 c                   1
6 e                   1
7 f                   1
8 z                   1
The table above shows the number of nodes reachable outward from each starting node. However, why do the counts at one level not add up to the count at the next level up? For example, node d (5) plus node c (1) gives 6, but node b is 7. I've been trying to explain this to colleagues, but I cannot adequately explain what the network is counting, or why adding up the counts from one level does not give the count for the level above.
This problem is exacerbated in a network with thousands of nodes, where it is also difficult to display. Does anyone know how to explain why the node counts do not add up to the next level? Any help is greatly appreciated.
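One observation that may be relevant, offered as a sketch rather than a confirmed answer: with mindist = 0 the count for a node appears to include the node itself, so a parent's count would be 1 plus the sum of its children's counts. The check below assumes that reading. The column n_descendants and the helper size_of() are names made up for this illustration and are not part of tidygraph.

```r
library(tidygraph)
library(tidyverse)

parent_child <- tribble(
  ~parent, ~child,
  "a", "b",
  "b", "c",
  "b", "d",
  "d", "e",
  "d", "f",
  "d", "g",
  "g", "z"
)

sizes <- parent_child %>%
  as_tbl_graph() %>%
  activate(nodes) %>%
  mutate(
    # mindist = 0: the starting node is counted along with everything below it
    n_community_out = local_size(order = graph_size(),
                                 mode = "out",
                                 mindist = 0),
    # mindist = 1 (hypothetical extra column): the starting node is skipped,
    # so only the nodes below it are counted
    n_descendants = local_size(order = graph_size(),
                               mode = "out",
                               mindist = 1)
  ) %>%
  as_tibble()

# Hypothetical helper: look up the mindist = 0 count for one node by name
size_of <- function(node) sizes$n_community_out[sizes$name == node]

# b's count is itself (1) plus the counts of its children c and d
size_of("b") == 1 + size_of("c") + size_of("d")

# d's count is itself (1) plus the counts of its children e, f and g
size_of("d") == 1 + size_of("e") + size_of("f") + size_of("g")
```

If this reading is right, the missing 1 at each level is the parent node itself, and n_descendants (the count without the starting node) may be the easier figure to explain to colleagues.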