I have a phylogenetic tree in .tre format and an accompanying dataset. The exact form of the tree does not matter; it is just a random phylogenetic tree. The dataset has two columns: names and colours.
When plotting the tree, I would like to add coloured points (two different colours) to the tips, using that accompanying dataset. The problem is that when I use the following piece of code:
ggtree(RANDOMTREE) + geom_tippoint(pch=16, col=RANDOMDATA$color) + geom_tiplab(offset=0.1)
it colours the points, but the colours are applied in the order in which they appear in the accompanying dataset, not in the order of the tips in the tree.
I would like to match the colours by species name instead, i.e. match the tip labels in the tree with the names in the dataset (they have the same format, but a different order). I have not figured out how to do that yet. Can you please help me with this?
Thank you very much.
Example code:
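# biocLite() has been superseded by BiocManager in current Bioconductor releases
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("ggtree")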
library(ggtree)
tree <- read.tree(text = "(spec1,((spec2,(spec9,(spec3,spec5))),spec8,(spec6,(spec7,spec4))));")
dataset1 <- data.frame(name = c("spec1", "spec2", "spec3", "spec4", "spec5", "spec6", "spec7", "spec8", "spec9"), colour = c("red", "red", "blue", "red", "red", "blue", "blue", "red", "blue"))
ggtree(tree) + geom_tiplab() + geom_tippoint(pch=16, col=as.factor(dataset1$colour))
What I get: wrongly labeled tree
What I would like to get: correctly labeled tree
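For reference, here is a minimal sketch of the kind of name-based matching I am after. It assumes that ggtree's `%<+%` operator joins a data frame to the tree by tip label (using the first column of the data frame as the label), and that `scale_colour_identity()` from ggplot2 makes the plot use the colour names stored in the `colour` column directly. The object name `p` is only for illustration, and I am not sure this is the recommended way to do it.

```r
library(ggtree)
library(ggplot2)

tree <- read.tree(text = "(spec1,((spec2,(spec9,(spec3,spec5))),spec8,(spec6,(spec7,spec4))));")

# Same data as in the example; the first column must hold the tip labels.
# stringsAsFactors = FALSE keeps the colour names as plain character strings.
dataset1 <- data.frame(
  name   = paste0("spec", 1:9),
  colour = c("red", "red", "blue", "red", "red", "blue", "blue", "red", "blue"),
  stringsAsFactors = FALSE
)

# %<+% attaches dataset1 to the tree data by label, so the row order
# of dataset1 should no longer matter.
p <- ggtree(tree) %<+% dataset1 +
  geom_tiplab(offset = 0.1) +
  geom_tippoint(aes(colour = colour), shape = 16) +
  scale_colour_identity()  # use the values in `colour` as literal colours

print(p)
```

If the join works as assumed, each tip point should take the colour listed next to its own name in `dataset1`, regardless of the order of the rows.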