Aggregating multiple variables on a data tree with R

Question

I am currently trying to build a tree for each of my samples in a metagenomic context.

Here is an example dataset:

library(phyloseq)
library(dplyr)
data(GlobalPatterns)

mydata <- merge(GlobalPatterns@tax_table, GlobalPatterns@otu_table, by = "row.names")
mydata <- mydata %>% dplyr::select(-one_of("Row.names"))
mydata$pathString <- apply(mydata[,1:7], 1, paste, collapse="/")
tree <- data.tree::as.Node(mydata)

The next step is to aggregate the values at each taxonomic level. If I look at the values for one sample, they are all NA except at the lowest taxonomic level (species):

print(tree, "CL3")

To perform the aggregation I used the idea proposed in "Aggregating values on a data tree with R", which works one sample at a time:

myApply <- function(node) {
  node$CL3_2 <- sum(c(node$CL3, purrr::map_dbl(node$children, myApply)), na.rm = TRUE)
}

myApply(tree)
print(tree, "CL3_2")

What I would like is a function that takes a vector of n sample names and performs the aggregation for each of them. I tried to change the function by adding a "sample" parameter, but without success:

# does not work
myApply <- function(node, sample) {
  new_sample <- paste0("new_", sample)
  node$new_sample <- sum(c(node$sample, purrr::map_dbl(node$children, myApply)), na.rm = TRUE)
}
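
For reference, two things stand out in the attempt above: `node$sample` and `node$new_sample` refer to fields literally named "sample" and "new_sample" (`$` does not evaluate its argument), and the recursive call does not forward `sample` to the children. Below is a minimal sketch of what the parameterised version is meant to do, assuming fields can be read and written by name with `[[`. The helper name `aggregate_sample` and the toy tree are hypothetical and only illustrate the idea; this has not been run against GlobalPatterns.

```r
library(data.tree)

# Hypothetical helper (not from the original post): sums one sample column
# up the tree and stores the result in a field named "new_<sample>".
aggregate_sample <- function(node, sample) {
  # Recurse first, forwarding `sample` to every child.
  child_totals <- vapply(node$children, aggregate_sample, numeric(1),
                         sample = sample)
  # `[[` looks the field up by the string held in `sample`, whereas
  # `node$sample` would look for a field literally called "sample".
  total <- sum(c(node[[sample]], child_totals), na.rm = TRUE)
  node[[paste0("new_", sample)]] <- total
  total
}

# Toy tree standing in for the taxonomy tree (values are made up).
toy <- Node$new("Bacteria")
firm <- toy$AddChild("Firmicutes")
prot <- toy$AddChild("Proteobacteria")
sp_a <- firm$AddChild("speciesA"); sp_a$CL3 <- 2; sp_a$CC1 <- 10
sp_b <- prot$AddChild("speciesB"); sp_b$CL3 <- 3; sp_b$CC1 <- 1

# One pass over the tree per sample name.
for (s in c("CL3", "CC1")) aggregate_sample(toy, s)
print(toy, "new_CL3", "new_CC1")
```

On the real tree the loop would run over the sample column names, for example `sample_names(GlobalPatterns)`.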

I would use Do to aggregate. See https://stackoverflow.com/a/47110241/4421537 — Christoph Glur, Apr 05 '18 at 09:56
I also tried your method from the previous post but didn't manage to generalise it to n variables. — Bambs, Apr 05 '18 at 13:18
