
I am new to using Vegan for ecosystem-level analysis.

I have a dataset with over 4,000 taxa across ten sites, and another with 37 chem-based observations from all ten sites.

I have analyzed both datasets using prcomp from the Vegan package and plotted the results using biplot(df). The chemistry plot looks great, but the taxa plot is (as expected) a ridiculous, unreadable mess.

I've read on one website that you can draw a biplot using only the variables with the largest loading values, i.e. those that are driving the plot. In their example they plotted only the top 10 loading variables from their prcomp results for clarity, but they did not show how they did it. (See http://www.pmc.ucsc.edu/~mclapham/Rtips/ordination.htm)

After hours of reading and searching for how to plot based on loading (rotation) values, I can't find an answer. Can anyone help me figure out how to narrow my variables down to the most important ones instead of looking at all 4,000?

Thank you in advance. Ina

Jari Oksanen
Ina.Quest
  • This is more a PCA question than an R question. In general, you can look at your singular value spectrum and see how many principal components are needed to capture most of the variation. The reference you point to doesn't plot the components themselves; it plots the scores along the main components. The PCA algorithm does you the favor of returning the components and their scores in eigenvalue order. – PeterK Mar 23 '15 at 16:40
  • Thanks Peter. I guess I am asking how to parse out the values I am getting in R to plot them. I checked, and PC1 and PC2 account for most of the variation, but this still has hundreds of entries. How can I cut them down to view only the top 100, for example? – Ina.Quest Mar 23 '15 at 16:44
  • Are your rows taxa or chem? Usually it would be taxa in which case you should have 37 factors. – PeterK Mar 23 '15 at 17:47
  • My rows are sites and each column is a different taxon or chem-value. – Ina.Quest Mar 23 '15 at 19:53
  • I interpreted this question to concern the number of arrows in a 2-D plot. You are using basic R commands of `stats` (`prcomp` and its `biplot`). These do not support selecting the variables (arrows) displayed in the plots, so you have to manually edit the result structure to contain only the variables you want to plot. The taxa arrows are in item `rotation`. However, there are "clean plot" functions that do this automatically. I haven't used them and don't know where to find them, but searching R could help. – Jari Oksanen Mar 24 '15 at 12:31
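
The manual edit described in the last comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the `taxa` matrix is simulated stand-in data (10 sites by 200 taxa, not the 4,000 from the question), the cutoff of 10 is arbitrary, and ranking variables by the length of their arrow in the PC1/PC2 plane is one possible definition of "most important", not the method used on the linked page.

```r
# Simulated stand-in data (assumption): 10 sites (rows) x 200 taxa (columns)
set.seed(1)
taxa <- matrix(rpois(10 * 200, lambda = 5), nrow = 10,
               dimnames = list(paste0("site", 1:10), paste0("taxon", 1:200)))

# Ordinary PCA with stats::prcomp (default centering, no scaling)
pca <- prcomp(taxa)

# Rank variables by the length of their loading vector on PC1 and PC2
arrow_len <- sqrt(rowSums(pca$rotation[, 1:2]^2))
top <- names(sort(arrow_len, decreasing = TRUE))[1:10]  # 10 is an arbitrary cutoff

# Copy the result and keep only the selected rows of `rotation`;
# the site scores in `pca_top$x` are left untouched
pca_top <- pca
pca_top$rotation <- pca$rotation[top, , drop = FALSE]

# biplot() now draws all sites but only the 10 selected arrows
biplot(pca_top)
```

Only the displayed arrows change here; the ordination itself is still computed from all columns. Swapping `1:10` for `1:100` would give the "top 100" view asked about in the comments.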

0 Answers