I am fairly new to microbiome analysis in R. I want to run a SIMPER test to see which species are driving compositional differences between sampling sites. I ran the test successfully using vegan. However, my dataset is very large, and when I call summary(simper) it prints thousands of lines to the console; I cannot get through it all, and the output is cut off after 1000 lines. Is it possible to return ONLY the species that are statistically significant?

Here is my code:

pc = read.csv("water.csv", header = TRUE)
com = pc[,6:ncol(pc)]
m_com = as.matrix(com)
simper <- simper(m_com, group = pc$Location, permutations = 999)
summary(simper)

I played around with the 'digits' argument of summary (summary(simper, ordered = TRUE, digits = max(3))), but this only changed the number of decimal places in the results. I also raised the print limit with options(max.print=1000000), but there is still far too much output.

I would like to view only the significant species if possible. Any help is appreciated. Thank you all.

1 Answer

I have no idea how to see 'significant' species, but if you want to display only the species that were reported with p-value at or below a critical value, you can try this:

## assuming simper is the result of simper() as in your post
lapply(simper, function(s) s[s$p <= 0.05,])

If you originally had thousands of species, I personally would not put much faith in this output, particularly after reading the warnings in the simper documentation (?simper, help(simper) or from your favourite GUI).

Jari Oksanen
  • Thank you! Yes I meant return any values with a p-value. I tried the code above but it gave me this error: Error in s[s$p <= 0.05, ] : incorrect number of dimensions –  May 15 '23 at 13:56
  • You get this error message if you have no `groups` which also means that you have no _p_-values, or if you have `groups` but `nperm = 0`. Which one is your case? – Jari Oksanen May 16 '23 at 18:32
  • Hi @Jari I was able to get the output I needed using lapply(summary(simper), function(s) s[s$p <= 0.05,]) instead!! Thank you :) –  May 18 '23 at 11:22
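
For reference, here is a self-contained sketch of the approach from the last comment. It runs on vegan's built-in dune data rather than the poster's water.csv, and the object names (sim, sig), the seed, the 99 permutations, and the 0.05 cutoff are illustrative assumptions. As far as I can tell, the filter works on the summary() output because each contrast there is a data frame with a p column, whereas the elements of the raw simper result are plain lists that cannot be indexed by row:

```r
library(vegan)

# Example data shipped with vegan (stand-in for the poster's own data)
data(dune)
data(dune.env)

set.seed(1)  # permutation p-values are random; fix the seed for repeatability

# Named 'sim' rather than 'simper' to avoid shadowing the function name
sim <- simper(dune, group = dune.env$Management, permutations = 99)

# summary() returns one data frame per pair of groups;
# keep only the rows (species) with p at or below the cutoff
sig <- lapply(summary(sim), function(s) {
  s[!is.na(s$p) & s$p <= 0.05, , drop = FALSE]
})

names(sig)   # the group contrasts
sig[[1]]     # species retained for the first contrast
```

Assigning the filtered list to an object instead of printing it also sidesteps the console truncation, since each contrast can then be inspected one at a time.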