1

I have a non-cyclical graph that can be considered a tree. Here's a simplified example:

library(tidygraph)
create_tree(20,2, directed = TRUE, mode="in") %>% plot

is

The real-life example can be a little more complicated, as I may have multiple paths from leaves to the root (all of them directed non-cyclical).

I want to simplify the graph by removing intermediate nodes as follows:

K=0

In most extreme case (lets call it "k=0" simplification) I would enumerate all leaves, assure that they are connected to the root through depth-first search and then remove all intermediate connections, effectively linking every leaf to the root.

K=-1

The next level simplification (say "k=-1") I want to start with nodes that have at least one leaf child and repeat the same procedure. After simplification, all intermediate nodes will be removed:

data.frame(from=c(5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20),
           to = c(1,1,1,1,1, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9,10)) %>% 
  as_tbl_graph() %>% plot

want

K=-2

The next step of simplification would not make sense for this graph, because no edges would be modified and no nodes would be removed.

How do I code it up using igraph/tidygraph in R?

p0bs
  • 1,004
  • 2
  • 15
  • 22
dmi3kno
  • 2,943
  • 17
  • 31

1 Answers1

0

The most important part of solution is being able to enumerate nodes from leaves back to the root effectively measuring "distance to nearest leaf"

Using the same example as above, lets add node names (create_tree() does not make nodes, for some strange reason):

library(tidygraph)

graph <- create_tree(20,2, directed = TRUE, mode="in") %>% 
  activate(nodes) %>% mutate(name=1:n())

We would need a helper function which would be able to measure "distance to leaf":

make_levels <- function(grdf){
  i <- 0
  repeate <- TRUE
  # create helper column
  grdf <- grdf %>% 
    mutate(leaf = node_is_leaf(),
           level=ifelse(leaf, i, NA))

  while(repeate){
  i <- i + 1
  index <- grdf %>% activate(edges) %>%
    mutate(from_leaf=.N()$leaf[from]) %>% 
    as_tibble() %>% filter(from_leaf) %>% pull(to)

  grdf <- grdf %>% activate(nodes) %>% 
    mutate(leaf = 1:n() %in% index,
           level=ifelse(leaf & is.na(level), i, level))

  repeate <- grdf %>% activate(nodes) %>% 
    as_tibble() %>% pull(level) %>% is.na() %>% any()
  }
  # remove helper column
  grdf %>% activate(nodes) %>% select(-leaf)    
}

After this, solution (for k=-1 above) should be easy:

graph %>% make_levels() %>% activate(edges) %>% 
  reroute(from = from, to = 1, subset=(to %in% which(.N()$level==2))) %>% 
  activate(nodes) %>% filter(level!=2) %>% plot()

Which produces:

done

dmi3kno
  • 2,943
  • 17
  • 31