1

Is there a way to get the structure of an unrooted tree with the phyloseq::tree_layout() function?

Using tree_layout() will give you the coordinates of the nodes and segments that compose the tree plotted below. You can then easily redraw that tree.

library(ape)
library(phyloseq) # bioconductor
#
cat("(((Strix_aluco:4.2,Asio_otus:4.2):3.1,",
    "Athene_noctua:7.3):6.3,Tyto_alba:13.5);",
    file = "ex.tre", sep = "\n")
tree.owls <- read.tree("ex.tre")


par(mfrow=c(2,1))

#original tree
plot.phylo(tree.owls,type = 'p',main = 'plot.phylo')

#redraw structure with tree_layout object
tree.ly <- tree_layout(tree.owls)
plot(1,type='n',axes=FALSE, xlim = c(0,max(tree.ly$edgeDT$xright)),ylim = c(0,max(tree.ly$edgeDT$y)),main = 'tree_layout')
segments(x0=tree.ly$edgeDT$xleft,y0=tree.ly$edgeDT$y,x1=tree.ly$edgeDT$xright,y1=tree.ly$edgeDT$y)
segments(x0=tree.ly$vertDT$x,y0=tree.ly$vertDT$vmin,x1=tree.ly$vertDT$x,y1=tree.ly$vertDT$vmax)

But what if you want to redraw this tree: plot.phylo(tree.owls,type = 'u'). How would you do it?

  • Seems like `tree_layout` does not use the plotted phylogeny, but the `phylo` object only. So not sure you can get different coordinates for the rooted and unrooted tree. If you have a phylogeny and don't want to plot it with `ape` you can try `phytools` or `ggtree` (for a ggplot2 based alternative). If you are after the coordinates of nodes and tips of an already plotted phylogeny, you can try `get("last_plot.phylo",envir=.PlotPhyloEnv)` – teofil Sep 20 '19 at 21:24

1 Answers1

1
library(ape)
library(phyloseq)

cat(
  "(((Strix_aluco:4.2,Asio_otus:4.2):3.1,",
  "Athene_noctua:7.3):6.3,Tyto_alba:13.5);",
  file = "ex.tre",
  sep = "\n"
)
tree.owls <- read.tree("ex.tre")

tree_layout() works with the phylo object, not with the plotted phylogeny. So calling tree_layout() after plot.phylo(type='p') vs. plot.phylo(type='u') will result in the same object.

plot.phylo(tree.owls, type = 'p')
a <- tree_layout(tree.owls)

plot.phylo(tree.owls, type = 'u')
b <- tree_layout(tree.owls)

identical(a, b)
[1] TRUE

To get the coordinates of an already plotted phylogeny, we can use last_plot.phylo like so:

plot.phylo(tree.owls, type = 'p')
coords_phylogram <- get("last_plot.phylo",envir=.PlotPhyloEnv)

plot.phylo(tree.owls, type = 'u')
coords_unrooted <- get("last_plot.phylo",envir=.PlotPhyloEnv)

identical(coords_phylogram, coords_unrooted)
[1] FALSE

We can see that the coordinates are now different between the phylogram and unrooted network.

The output of get("last_plot.phylo",envir=.PlotPhyloEnv) is a named list, so we can extract particular element easily. For example, if we wanted to add some annotations at the coordinates of nodes and tips, we could do the following:

points(x = coords_unrooted$xx, y=coords_unrooted$yy, pch=21, bg="yellow", cex=3)

EDIT: create a layout data frame (like tree_layout) and plot an unrooted phylogeny

Preliminaries. phyloseq is no longer required as we are extracting the layout by hand.

library(ape)
library(dplyr)
tr <- "(((Strix_aluco:4.2,Asio_otus:4.2):3.1,Athene_noctua:7.3):6.3,Tyto_alba:13.5);"
tree.owls <- read.tree(text = tr)

A function to create tree_layouts()'s edgeDT and vertDT. To draw segments that are not only horizontal or vertical (like you would have in a cladogram or phylogram), we need another column for the y0 coordinates as these are now different from the y coordinates. This way we can plot slanted phylograms and unrooted trees.

Important note: Edge lengths as calculated by this function and stored in edgeDT are incorrect (in some cases)! This is because they are calculated by subtracting coordinates, and this obviously doesn't work if edges were plotted at an angle (as in a slanted phylogram or unrooted network). This does not matter for plotting, as the plotting function does not use edge.lengths. But keep it in mind.

layout_from_plot <- function(tree, type, drop_root=FALSE, use_vert=FALSE) {
  plot.phylo(x = tree, type = type)
  title("Plot to get coordinates")
  coords <- get("last_plot.phylo",envir=.PlotPhyloEnv)

  edgeDT <- tibble(
    xright = coords$xx[coords$edge[,2]],
    xleft = coords$xx[coords$edge[,1]],
    y = coords$yy[coords$edge[,2]],
    edge.length = xright-xleft,
    V1 = coords$edge[,1],
    V2 = coords$edge[,2]
  ) 

  edgeDT <- edgeDT %>%
    arrange(V2) %>%
    mutate(OTU = c(tree$tip.label, rep(NA_character_, coords$Nnode - 1))) %>%
    select(V1, V2, edge.length, OTU, xleft, xright, y)

  if (!use_vert) {
    edgeDT <- mutate(edgeDT, y0=y[V1-1])
  } else {
    edgeDT <- mutate(edgeDT, y0=y)
  }

  # this is a bit hacky, but it does work. 
  # double check that OTUs are in the right positions 
  # and branch lengths are correct.
  # ideally, you would unroot the tree, then plot the network  
  # to extract coordinates. See below.

  if (is.rooted(phy = tree)) {
    if (drop_root) {
      edgeDT <- edgeDT %>% 
        group_by(y0) %>% 
        mutate(xleft = ifelse(y == y0, xright, xleft)) %>% 
        mutate(xright = ifelse(y == y0, lead(xright), xright)) %>% 
        mutate(y = ifelse(y == y0, lead(y), y)) %>% 
        distinct(y, y0, .keep_all = TRUE)
    }
  }

  vertDT <- edgeDT %>% 
    group_by(V1) %>% 
    mutate(vmin=min(y), vmax=max(y)) %>% 
    mutate(x=xleft[which(y==min(y))]) %>% 
    select(V1, x, vmin, vmax) %>% 
    distinct()

  return(list("edgeDT"=edgeDT, "vertDT"=vertDT))
}

A function to plot different types of trees based on the extracted tree layout.

plot_from_layout <- function(tree_ly, plot_vert=FALSE) {
  plot(
    1,
    type = 'n',
    axes = TRUE,
    xlim = c(0, max(tree_ly$edgeDT$xright)),
    ylim = c(0, max(c(tree_ly$edgeDT$y, tree_ly$edgeDT$y0))),
    main = 'tree_layout'
  )
  segments(
    x0 = tree_ly$edgeDT$xleft,
    y0 = tree_ly$edgeDT$y0,
    x1 = tree_ly$edgeDT$xright,
    y1 = tree_ly$edgeDT$y
  )
  if (plot_vert) {
    segments(
      x0 = tree_ly$vertDT$x,
      y0 = tree_ly$vertDT$vmin,
      x1 = tree_ly$vertDT$x,
      y1 = tree_ly$vertDT$vmax
    )
  }
}

Testing.

par(mfrow=c(1,2))

layout_from_plot(
  tree = tree.owls,
  type = "p",
  drop_root = FALSE,
  use_vert = TRUE
) %>% 
  plot_from_layout(tree_ly = ., plot_vert = TRUE)

enter image description here

layout_from_plot(
  tree = tree.owls,
  type = "c",
  drop_root = FALSE,
  use_vert = FALSE
) %>%
  plot_from_layout(tree_ly = ., plot_vert = FALSE)

enter image description here

layout_from_plot(
  tree = tree.owls,
  type = "u",
  drop_root = TRUE,
  use_vert = FALSE
) %>%
  plot_from_layout(tree_ly = ., plot_vert = FALSE)

enter image description here

The preferred way, instead of removing edges by hand (as above), would be to unroot the tree first, then plot it. For this to work, you would need to modify the way coordinates are extracted. The main question is how the root node splits the edges and to which branch to add the leftover after removing the root node. By default it appears that this extra edge is added to the one internal edge of the network, but this does not seem right given the phylogram and branch lengths there. See below.

owls.unrooted <- unroot(tree.owls)
layout_from_plot(
  tree = owls.unrooted,
  type = "u",
  drop_root = FALSE,
  use_vert = FALSE
) %>%
  plot_from_layout(tree_ly = ., plot_vert = FALSE)

enter image description here

EDIT 2: Update the layout_from_plot function to handle the y0 coordinates correctly and use unrooted trees instead of manually removing edges.

I realized I was over-complicating with creating the column for the y0 coordinates. All that is needed is to take the yy coordinates ordered by the from the first column of edge from the output of get("last_plot.phylo",envir=.PlotPhyloEnv). Now things work fine with rooted and unrooted trees.

Updated function:

layout_from_plot <- function(tree, type, drop_root=FALSE, use_vert=FALSE) {

  if (drop_root) {
    tree <- unroot(tree)
  }

  plot.phylo(x = tree, type = type)
  title("Plot to get coordinates")
  coords <- get("last_plot.phylo",envir=.PlotPhyloEnv)

  edgeDT <- tibble(
    xright = coords$xx[coords$edge[,2]],
    xleft = coords$xx[coords$edge[,1]],
    y = coords$yy[coords$edge[,2]],
    y0 = coords$yy[coords$edge[,1]],
    edge.length = xright-xleft,
    V1 = coords$edge[,1],
    V2 = coords$edge[,2]
  ) 

  edgeDT <- edgeDT %>%
    arrange(V2) %>%
    mutate(OTU = c(tree$tip.label, rep(NA_character_, coords$Nnode - 1))) %>%
    select(V1, V2, edge.length, OTU, xleft, xright, y, y0)

  if (use_vert) {
    edgeDT <- mutate(edgeDT, y0=y)
  }

  vertDT <- edgeDT %>% 
    group_by(V1) %>% 
    mutate(vmin=min(y), vmax=max(y)) %>% 
    mutate(x=xleft[which(y==min(y))]) %>% 
    select(V1, x, vmin, vmax) %>% 
    distinct()

  return(list("edgeDT"=edgeDT, "vertDT"=vertDT))
}

Additional test code:

layout_from_plot(tree = bird.orders, type = "c", use_vert = FALSE, drop_root = TRUE) %>% 
  plot_from_layout(tree_ly = ., plot_vert = FALSE)

enter image description here

layout_from_plot(tree = bird.orders, type = "u", drop_root = TRUE, use_vert = FALSE) %>% 
  plot_from_layout(tree_ly = ., plot_vert = FALSE)

enter image description here

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
teofil
  • 2,344
  • 1
  • 8
  • 17
  • this is a great start, thank you so you!. But it still isn't obvious how to redraw your tree based on that information in a way that tree_layout() gives you. I guess there is a way using the information from coords_unrooted$xx, coords_unrooted$yy and coords_unrooted$edge, but not sure how. – Sebastien Renaut Sep 21 '19 at 01:55
  • Can you explain what your intended output is? – teofil Sep 21 '19 at 02:03
  • You can easily redraw a rooted tree with tree_layout(). But how to do it with an unrooted tree? (see modified question). – Sebastien Renaut Sep 21 '19 at 23:58
  • @SebastienRenaut Have a look at the edit in my answer. I am curious, why not use one of the existing packages to make these plots? – teofil Sep 22 '19 at 09:35
  • great. thanks. I can reproduce your work using this example. More complex examples seem to produce odd results or error (e.g.: data(bird.orders),plot(bird.orders, type = 'u'). I'll have to spend some time examining layout_from_plot(), to see where things go wrong. – Sebastien Renaut Sep 23 '19 at 19:14
  • I want to redraw a phylogeny using library(plotly) and based on examples I found so far, I need coordinates of phylogenetic tree as input for plotly()... – Sebastien Renaut Sep 23 '19 at 19:16
  • @SebastienRenaut Right, I think I figured out the issue. Test the updated `layout_from_plot` function. Should work. – teofil Sep 23 '19 at 20:30
  • Great thanks. I've tested it with different datasets & it seems to work well for type='u' (which is all I really wanted!!!) and type= 'c'. For type='r', you need to modify the xlim and ylim in plot_from_layout() to xlim = c(min(tree_ly$edgeDT$xright), max(tree_ly$edgeDT$xright)) AND ylim = c(min(tree_ly$edgeDT$xright), max(c(tree_ly$edgeDT$y, tree_ly$edgeDT$y0))) – Sebastien Renaut Sep 26 '19 at 15:52
  • For type="p", you need to make sure that use_vert=T in layout_from_plot() and plot_vert=T in plot_from_layout() – Sebastien Renaut Sep 26 '19 at 15:56
  • please consider submitting a pull request to update the tree_layout() function in https://github.com/joey711/phyloseq/ – Sebastien Renaut Sep 26 '19 at 18:48