Maybe I missed something in how tax_glom works, but I did not find any information here or elsewhere on the web, so maybe someone here can help. I have not attached data, but I can provide it on request. Here is the code that highlights my issue:

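library(phyloseq)
library(magrittr)  # provides %>%

# Per-sample read totals before agglomeration (assumes taxa are rows)
colSums(CYANO %>% otu_table())

CYANO_gen <- CYANO %>%
  tax_glom(taxrank = "Genus")
# Per-sample read totals after agglomeration at the Genus rank
colSums(CYANO_gen %>% otu_table())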

CYANO is a phyloseq object that I wanted to agglomerate at the Genus rank, but I noticed that one sample (named 100) was missing from a plot. This led me to check where the problem occurred. 7 samples out of 54 show discrepancies between the totals before and after agglomeration, as shown in the last line of the attached image. Weird, isn't it?

[Image: results of the code above, plus 2 additional lines that show how large the discrepancies are and that not every sample is affected]

Thanks, Guillaume

GTC
  • After checking other datasets and hitting the same issue, I found the reason for the discrepancies: it lies in the bad_empty argument, which has to be set to FALSE in tax_glom: `tax_glom(taxrank = "Genus", bad_empty=F)` – GTC Apr 14 '22 at 07:14

1 Answer

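The NArm argument of tax_glom is set to TRUE by default. With that setting, taxa whose value at the chosen rank is NA (or matches one of the bad_empty values) are removed before agglomeration, so their counts disappear from the sample totals. To avoid losing those observations, set NArm = FALSE. Cheers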
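
A minimal sketch of the effect on a toy phyloseq object may help. The OTU names, sample names, taxonomy and counts below are made up for illustration and are not the CYANO data; the sketch only assumes that the phyloseq package is installed.

```r
library(phyloseq)

# Hypothetical toy data: 4 taxa x 2 samples.
# OTU4 has no Genus assignment (NA) and only occurs in sample S2.
counts <- matrix(c(10,  5,
                   20,  0,
                    3,  7,
                    0, 40),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("OTU", 1:4), c("S1", "S2")))
taxonomy <- matrix(c("Cyanobacteria", "Nostoc",
                     "Cyanobacteria", "Nostoc",
                     "Cyanobacteria", "Anabaena",
                     "Cyanobacteria", NA),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(paste0("OTU", 1:4), c("Phylum", "Genus")))

ps <- phyloseq(otu_table(counts, taxa_are_rows = TRUE),
               tax_table(taxonomy))

# Per-sample totals before agglomeration
before <- sample_sums(ps)

# Default behaviour (NArm = TRUE): OTU4 is dropped, so S2 loses its reads
after_default <- sample_sums(tax_glom(ps, taxrank = "Genus"))

# NArm = FALSE: the unassigned taxon is kept and the totals are preserved
after_keep_na <- sample_sums(tax_glom(ps, taxrank = "Genus", NArm = FALSE))

print(rbind(before, after_default, after_keep_na))
```

Comparing `sample_sums()` before and after `tax_glom()` in this way is also a quick check on a real object: any sample whose total changes contains reads from taxa that are unassigned at the chosen rank.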

Zina