I have a Phyloseq object like the following:

[Image: Phyloseq Object]

My goal is to take a random sample of size n from this object. I have tried every sampling function in the phyloseq package without success, and other sampling methods do not work with phyloseq objects. I considered converting the object to a data frame and sampling from that, but I am unsure how to convert the result back into the same kind of phyloseq object, just with fewer rows.

If anyone has a way to take random samples from a phyloseq object, I would greatly appreciate your insight. Thanks!

Detr4

1 Answer

You can randomly sample from the vector of sample_names, and then prune the phyloseq object to those samples.

library(phyloseq)

# Load example data
data("GlobalPatterns")
ps <- GlobalPatterns

# Randomly subset the samples of a phyloseq object.
#   ps: phyloseq object to be sampled
#   FUN: function used to draw from the sample names (default `sample`)
#   ...: arguments passed on to FUN, e.g. `size`; see `help(sample)`
sample_ps <- function(ps, FUN = sample, ...){
  ids <- sample_names(ps)
  sampled_ids <- FUN(ids, ...)
  ps <- prune_samples(sampled_ids, ps)
  return(ps)
}

sample_ps(ps, size=10)
#> phyloseq-class experiment-level object
#> otu_table()   OTU Table:         [ 19216 taxa and 10 samples ]
#> sample_data() Sample Data:       [ 10 samples by 7 sample variables ]
#> tax_table()   Taxonomy Table:    [ 19216 taxa by 7 taxonomic ranks ]
#> phy_tree()    Phylogenetic Tree: [ 19216 tips and 19215 internal nodes ]

Created on 2023-03-08 by the reprex package (v2.0.1)
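
Two follow-ups may be useful here; treat this as a rough sketch rather than a definitive recipe. First, calling `set.seed()` before sampling makes the draw reproducible (the seed value 42 below is arbitrary). Second, note that the output above still lists all 19216 taxa: `prune_samples()` only drops samples. If "fewer rows" in the question meant fewer taxa, the same pattern should work with `taxa_names()` and `prune_taxa()`. The object names `ps_samples`, `ps_taxa`, and `ps_nonzero` are made up for this illustration.

```r
library(phyloseq)

data("GlobalPatterns")
ps <- GlobalPatterns

# Fix the RNG seed so the same subset is drawn on every run.
set.seed(42)  # arbitrary value, chosen only for illustration

# Draw 10 sample names without replacement and keep only those samples.
keep_samples <- sample(sample_names(ps), size = 10)
ps_samples <- prune_samples(keep_samples, ps)
nsamples(ps_samples)

# Optional: drop taxa that have zero counts in the retained samples.
ps_nonzero <- prune_taxa(taxa_sums(ps_samples) > 0, ps_samples)

# Variant (assumes taxa, not samples, are what should be subsampled):
# draw 100 taxa names and keep only those taxa.
keep_taxa <- sample(taxa_names(ps), size = 100)
ps_taxa <- prune_taxa(keep_taxa, ps)
ntaxa(ps_taxa)
```

Sampling is done without replacement here (the default for `sample`). With `replace = TRUE` the drawn names can repeat, and since pruning keeps each sample at most once, the result would likely contain fewer than `size` samples.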

gmt