I would like to color the internal nodes of a tree according to a categorical variable using ggtree. However, I receive the error "Must request at least one colour from a hue palette." I have built a reproducible example using the example tree from a similar question you can find here.

#I load the packages (read.tree comes from ape)
library(ape)
library(ggtree)

#I read my tree
tree.1 <- read.tree(text="(spec1:2.2,((spec2:1.8,(spec9:1.4,(spec3:1.3,spec5:1.3):0.1):0.4):0.2,(spec8:1.7,(spec6:1.5,(spec7:1,spec4:1):0.5):0.2):0.3):0.2);")

#I create some node labels
tree.1$node.label <- rep(100, 8)

#I create a plot with labeled tips and nodes
tree_plot <- ggtree(tree.1, branch.length = "none", ladderize = TRUE) + geom_tiplab() + geom_nodelab()

#I create a categorical variable to color the nodes
category_nodes <- c(rep("cat1", 4), rep("cat2", 4))
category_nodes_df <- data.frame(c(1:8), category_nodes)

#I try to use this variable to color the nodes
tree_plot %<+% category_nodes_df + geom_nodepoint(aes(fill=category_nodes))

I feel this should be fairly straightforward, but I do not understand what is causing the error. I have tried a few things without success.

First, I converted category_nodes to a factor with as.factor. I believe ggplot2 already treats character vectors as discrete variables, so I did not expect this to make a difference, and it did not.

Second, in my actual code an earlier error complained about the length of the categorical vector. The vector only contained entries for internal nodes, but a length equal to the number of internal nodes plus tips was required (in my case: "Aesthetics must be either length 1 or the same as the data (259)"). I therefore appended the appropriate number of NAs to the vector. If I apply the same change to the example, however, nothing changes:

category_nodes <- c(rep("cat1", 4), rep("cat2", 4), rep(NA, 9))
category_nodes_df <- data.frame(c(1:17), category_nodes)

category_nodes_df$cat_factor <- as.factor(category_nodes_df$category_nodes)

tree_plot %<+% category_nodes_df + geom_nodepoint(aes(fill=cat_factor))

1 Answer

I have found a solution, although I am still unsure why the original code does not work. What seems to work is:

#create the categorical variable as a factor, then relabel its levels with color names (cat1 -> green, cat2 -> blue)
category_nodes <- c(rep("cat1", 4), rep("cat2", 4))
category_nodes_col <- as.factor(category_nodes)
levels(category_nodes_col) <- c("green", "blue")

#plot
ggtree(tree.1, branch.length = "none", ladderize = TRUE) + geom_tiplab() + geom_nodelab() + geom_nodepoint(color = category_nodes_col)
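A possible explanation for the original error, offered as a sketch rather than a confirmed diagnosis: `%<+%` appears to match the first column of the attached data frame against the tree labels, so the values 1 to 8 match nothing, every node ends up with `NA` as its category, and the hue palette is asked for zero colours. The example below assumes that `%<+%` joins by node number when the data frame has a column literally named `node`, and it relies on ape numbering internal nodes from `Ntip + 1` to `Ntip + Nnode`. It also maps `color` instead of `fill`, because `fill` has no visible effect on the default point shape.

```r
library(ape)
library(ggtree)

tree.1 <- read.tree(text = "(spec1:2.2,((spec2:1.8,(spec9:1.4,(spec3:1.3,spec5:1.3):0.1):0.4):0.2,(spec8:1.7,(spec6:1.5,(spec7:1,spec4:1):0.5):0.2):0.3):0.2);")
tree.1$node.label <- rep(100, 8)

# ape numbers tips 1..Ntip and internal nodes Ntip+1..Ntip+Nnode
n_tip <- Ntip(tree.1)
internal_nodes <- (n_tip + 1):(n_tip + Nnode(tree.1))

# Assumption: a column named "node" makes %<+% join on node numbers
# instead of matching the first column against the tree labels
category_nodes_df <- data.frame(
  node = internal_nodes,
  category_nodes = c(rep("cat1", 4), rep("cat2", 4))
)

tree_plot <- ggtree(tree.1, branch.length = "none", ladderize = TRUE) +
  geom_tiplab() +
  geom_nodelab()

# Map the category to color; tips keep NA but geom_nodepoint only draws internal nodes
p <- tree_plot %<+% category_nodes_df +
  geom_nodepoint(aes(color = category_nodes), size = 3)
p
```

Compared with passing a colour vector directly to `geom_nodepoint`, this ties each category to an explicit node number, so the result does not depend on the row order of the plot data. Adding `scale_color_manual(values = c(cat1 = "green", cat2 = "blue"))` should reproduce the colours used above.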