I have two phylogenetic trees which have the same topology (except for branch lengths).
In R, using ape:
t1 <- ape::read.tree(file="",text="(((HS:72,((CP:30,CL:30.289473923):32,RN:62):10):2,(CS:63,BS:63):11):5,LA:79);")
t2 <- ape::read.tree(file="",text="(((((CP:39,CL:39):29,RN:68):9,HS:77):5,(BS:63,CS:63):19):14,LA:96);")
> ape::all.equal.phylo(t1,t2,use.edge.length = F,use.tip.label = T)
[1] TRUE
I want to compute the mean branch lengths across the two trees. The problem is that although their topologies are identical, the order in which their nodes are represented is not, and not all tree nodes are labeled tips, so I don't think there's a simple join solution:
> head(tidytree::as_tibble(t1))
# A tibble: 6 x 4
parent node branch.length label
<int> <int> <dbl> <chr>
1 10 1 72 HS
2 12 2 30 CP
3 12 3 30.3 CL
4 11 4 62 RN
5 13 5 63 CS
6 13 6 63 BS
> tail(tidytree::as_tibble(t1))
# A tibble: 6 x 4
parent node branch.length label
<int> <int> <dbl> <chr>
1 8 8 NA NA
2 8 9 5 NA
3 9 10 2 NA
4 10 11 10 NA
5 11 12 32 NA
6 9 13 11 NA
> head(tidytree::as_tibble(t2))
# A tibble: 6 x 4
parent node branch.length label
<int> <int> <dbl> <chr>
1 12 1 39 CP
2 12 2 39 CL
3 11 3 68 RN
4 10 4 77 HS
5 13 5 63 BS
6 13 6 63 CS
> tail(tidytree::as_tibble(t2))
# A tibble: 6 x 4
parent node branch.length label
<int> <int> <dbl> <chr>
1 8 8 NA NA
2 8 9 14 NA
3 9 10 5 NA
4 10 11 9 NA
5 11 12 29 NA
6 9 13 19 NA
So it's not clear to me how to match corresponding branches between the two trees in order to take the mean of their lengths.
Any idea how to match them, or how to reorder t2 according to t1?
Supposedly phytools' matchNodes method is meant for that, but it doesn't seem like it's getting it right:
phytools::matchNodes(t1, t2,method = "descendants")
tr1 tr2
[1,] 8 8
[2,] 9 9
[3,] 10 10
[4,] 11 11
[5,] 12 12
[6,] 13 13
At the very least I'd expect it to match the tips correctly, meaning:
dplyr::left_join(dplyr::filter(tidytree::as_tibble(t1),!is.na(label)) %>% dplyr::select(node,label) %>% dplyr::rename(t1.node=node),
+ dplyr::filter(tidytree::as_tibble(t2),!is.na(label)) %>% dplyr::select(node,label) %>% dplyr::rename(t2.node=node))
Joining, by = "label"
# A tibble: 7 x 3
t1.node label t2.node
<int> <chr> <int>
1 1 HS 4
2 2 CP 1
3 3 CL 2
4 4 RN 3
5 5 CS 6
6 6 BS 5
7 7 LA 7
But that's not happening.
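Judging from the output above, matchNodes reports only nodes 8 to 13, which are the internal nodes, so the tip correspondence has to come from the labels. A minimal sketch that needs only ape, assuming (as in ape's `phylo` objects) that tip i in `tip.label` has node number i; the name `tip_map` is just for illustration:

```r
t1 <- ape::read.tree(text = "(((HS:72,((CP:30,CL:30.289473923):32,RN:62):10):2,(CS:63,BS:63):11):5,LA:79);")
t2 <- ape::read.tree(text = "(((((CP:39,CL:39):29,RN:68):9,HS:77):5,(BS:63,CS:63):19):14,LA:96);")

# Tip node numbers are the positions in tip.label, so matching the
# label vectors gives the t1 -> t2 tip correspondence directly.
tip_map <- data.frame(
  label   = t1$tip.label,
  t1.node = seq_along(t1$tip.label),
  t2.node = match(t1$tip.label, t2$tip.label),
  stringsAsFactors = FALSE
)
print(tip_map)
```

This should reproduce the same pairs as the join above (HS 1 and 4, CP 2 and 1, and so on).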
Ultimately the information for matching is in these tree tibbles, because they list the parent of each node, but in practice using that information to match the nodes probably requires some recursive steps.
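A minimal sketch of such a recursive matching, under the assumption that both trees are rooted and share the same topology and tip labels. Each branch is identified by the sorted set of tip labels below its child node, and the two trees are then joined on that key. The helpers `tips_below` and `branch_table` are hypothetical names, not part of ape or phytools:

```r
t1 <- ape::read.tree(text = "(((HS:72,((CP:30,CL:30.289473923):32,RN:62):10):2,(CS:63,BS:63):11):5,LA:79);")
t2 <- ape::read.tree(text = "(((((CP:39,CL:39):29,RN:68):9,HS:77):5,(BS:63,CS:63):19):14,LA:96);")

# Recursively collect the tip labels descending from a node,
# walking the parent/child pairs stored in tree$edge.
tips_below <- function(tree, node) {
  n_tip <- length(tree$tip.label)
  if (node <= n_tip) return(tree$tip.label[node])
  children <- tree$edge[tree$edge[, 1] == node, 2]
  unlist(lapply(children, function(ch) tips_below(tree, ch)))
}

# One row per branch, keyed by the tip set below the branch's child node.
branch_table <- function(tree) {
  child <- tree$edge[, 2]
  clade <- vapply(child, function(nd) {
    paste(sort(tips_below(tree, nd)), collapse = "|")
  }, character(1))
  data.frame(clade = clade, node = child, length = tree$edge.length,
             stringsAsFactors = FALSE)
}

b1 <- branch_table(t1)
b2 <- branch_table(t2)

# Branches with the same tip set are the same branch in both trees.
m <- merge(b1, b2, by = "clade", suffixes = c(".t1", ".t2"))
m$mean.length <- (m$length.t1 + m$length.t2) / 2
print(m)

# Copy of t1 carrying the mean lengths, kept in t1's edge order.
t_mean <- t1
t_mean$edge.length <- m$mean.length[match(b1$clade, m$clade)]
```

Because the join key is the tip set rather than the node number, the result does not depend on how either tree numbers or orders its nodes. If the topologies differed, unmatched clades would be dropped by `merge`, so it is worth checking that `nrow(m)` equals the number of branches.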