
I have a tree annotated at the genus level (i.e. each leaf has a genus name), and I want to propagate the leaf colors up into the branches/edges for as long as all the children share the same genus, like in this plot:

[image: example plot] (Source)

My tree is here (sorry, dput doesn't work...) and it looks like this:

library(ggraph)
library(tidygraph)
load("tree_v3")

TBL %>% activate(nodes) %>% as_tibble
# A tibble: 50 x 2
    leaf      Genus
   <lgl>     <fctr>
 1 FALSE         NA
 2  TRUE Klebsiella
 3  TRUE Klebsiella
 4 FALSE         NA
 5  TRUE Klebsiella
 6  TRUE Klebsiella
 7 FALSE         NA
 8 FALSE         NA
 9  TRUE Klebsiella
10 FALSE         NA
# ... with 40 more rows

I can plot the tree with this code but, as you can see, the edge colors stay near the leaves instead of extending up the branches.

TBL %>%
  ggraph('dendrogram') + 
  theme_bw() +
  geom_edge_diagonal2(aes(color = node.Genus)) +
  scale_edge_color_discrete(guide = FALSE) +
  geom_node_point(aes(filter = leaf, color = Genus), size = 2)

[image: resulting plot]

There is some code in the section "Mapping over searches" of this blog post, but it doesn't work on my data and I don't understand why...

TBL2 <- TBL %>%
  activate(nodes) %>%
  mutate(Genus = map_bfs_back_chr(node_is_root(), .f = function(node, path, ...) {
    nodes <- .N()
    if (nodes$leaf[node]) return(nodes$Genus[node])
    if (anyNA(unlist(path$result))) return(NA_character_)
    path$result[[1]]
  }))

Error in mutate_impl(.data, dots) : Evaluation error: Cannot coerce values to character(1).

EDIT after Marco Sandri answer

With mutate(Genus = as.character(Genus)) there is no error message anymore, but Genus doesn't propagate correctly. For instance, look at the third and fourth nodes from the right: their parent is supposed to be NA... (note that it doesn't work in the blog post's plot either).

[image: plot after the as.character() fix]

p0bs
abichat

1 Answer


Genus in TBL is a factor:

str(TBL %>% activate(nodes) %>% as_tibble)

# Classes ‘tbl_df’, ‘tbl’ and 'data.frame':       50 obs. of  2 variables:
# $ leaf : logi  FALSE TRUE TRUE FALSE TRUE TRUE ...
# $ Genus: Factor w/ 10 levels "","Citrobacter",..: NA 6 6 NA 6 6 NA NA 6 NA ...

but it should be a character vector.
After converting Genus from factor to character, the code runs without the error.

TBL2 <- TBL %>%
  activate(nodes) %>%
  mutate(Genus = as.character(Genus)) %>%
  mutate(Species = map_bfs_back_chr(node_is_root(), .f = function(node, path, ...) {
    nodes <- .N()
    if (nodes$leaf[node]) return(nodes$Genus[node])
    if (anyNA(unlist(path$result))) return(NA_character_)
    path$result[[1]]
  }))
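
To see why the factor column matters, here is a minimal base-R sketch on made-up data (the `genus` vector below is hypothetical, not the asker's tree). It only illustrates that a single factor element is not a character string; it does not reproduce the internal check that `map_bfs_back_chr()` performs.

```r
# Hypothetical toy data standing in for the Genus column.
genus <- factor(c("Klebsiella", NA, "Citrobacter"))

# A factor element is not a character value: it is an integer code
# with a levels attribute attached.
is.character(genus[1])
typeof(genus[1])

# After conversion, each element is a plain string (or NA_character_).
genus_chr <- as.character(genus)
is.character(genus_chr[1])
genus_chr
```

So a callback that returns `nodes$Genus[node]` hands back a factor unless the column has been converted first.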
Marco Sandri
  • Thank you for your help. However, the code doesn't work very well, as I said in my edit. – abichat Aug 24 '17 at 06:19
  • @MrSnake This is a different and new question. Now the code works but it does not give you the expected result. I suggest you to close (not to delete) this post and to open a new one where you describe in details your problem. – Marco Sandri Aug 24 '17 at 13:00
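
For reference, the rule described in the question (a parent keeps a genus only if all of its children carry that same genus) can be sketched in base R, independent of tidygraph. Note that the callback quoted above returns `path$result[[1]]` after the NA check and never compares the entries with each other, which may or may not be what causes the wrong propagation. Everything below is an assumption made for illustration: the toy tree, the `parent` vector encoding, and the helper name `propagate_genus` are not from the original post.

```r
# Hypothetical toy tree: node i has parent parent[i]; the root has parent NA.
#        1
#      /   \
#     2     3
#    / \   / \
#   4   5 6   7
parent <- c(NA, 1, 1, 2, 2, 3, 3)
genus  <- c(NA, NA, NA, "Klebsiella", "Klebsiella", "Klebsiella", "Citrobacter")

propagate_genus <- function(parent, genus) {
  n <- length(parent)
  is_leaf <- !(seq_len(n) %in% parent)

  # Depth of every node, so that children can be handled before parents.
  depth <- integer(n)
  for (i in seq_len(n)) {
    p <- parent[i]
    while (!is.na(p)) {
      depth[i] <- depth[i] + 1L
      p <- parent[p]
    }
  }

  out <- as.character(genus)
  for (i in order(depth, decreasing = TRUE)) {
    if (is_leaf[i]) next
    child_vals <- out[which(parent == i)]
    # Keep a label only if no child is NA and all children agree.
    agree <- !anyNA(child_vals) && length(unique(child_vals)) == 1L
    out[i] <- if (agree) child_vals[1] else NA_character_
  }
  out
}

result <- propagate_genus(parent, genus)
result
```

In this toy tree, node 2 inherits "Klebsiella" because both of its leaves agree, while node 3 and the root stay NA because their children differ or are NA.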