
This is possibly a stupid question, but I was told to do a Redundancy Analysis in R (using the vegan package) to test for differences between groups in my data. However, I only have one dataset (roughly comparable to the Iris dataset (https://en.wikipedia.org/wiki/Iris_flower_data_set)), and everything I have found on RDA seems to need two matching datasets. Did I mishear or misunderstand, or is there something else going on here?

Anthony BH

1 Answer


As far as the underlying statistics are concerned, you do have two data matrices:

  1. the four morphological variables in the iris data set
  2. a single categorical predictor variable or constraint
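To make the two matrices concrete, here is a minimal base-R sketch. The names Y and X are illustrative only and are not used elsewhere in this answer; the formula interface takes care of this kind of expansion, so you would not normally build X yourself.

```r
# Response matrix: the four morphological measurements (150 rows, 4 columns)
Y <- as.matrix(iris[, 1:4])

# Constraint matrix: the Species factor expanded into an intercept
# plus dummy (treatment-contrast) columns
X <- model.matrix(~ Species, data = iris)

dim(Y)
dim(X)
colnames(X)
```

So the single grouping factor plays the role of the second, "matching" matrix.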

In vegan, using rda() with the iris example data, you would do:

library("vegan")
iris.d <- iris[, 1:4]
ord <- rda(iris.d ~ Species, data = iris)
ord

set.seed(1)
anova(ord)

The permutation test checks for differences between species:

> anova(ord)
Permutation test for rda under reduced model
Permutation: free
Number of permutations: 999

Model: rda(formula = iris.d ~ Species, data = iris)
          Df Variance      F Pr(>F)    
Model      2   3.9736 487.33  0.001 ***
Residual 147   0.5993                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
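If rda() stops with an "invalid type (list)" error raised from model.frame.default, one thing to try is the variant below. It rests on two assumptions rather than a confirmed diagnosis: that another attached package may be masking vegan's rda(), and that passing the response as a numeric matrix instead of a data frame is harmless here. The names iris.m and ord2 are made up for this sketch.

```r
library("vegan")

# Assumption: coerce the response to a plain numeric matrix
iris.m <- as.matrix(iris[, 1:4])

# Assumption: call vegan's rda() explicitly by namespace, in case a
# different rda() from another attached package is being picked up
ord2 <- vegan::rda(iris.m ~ Species, data = iris)

set.seed(1)
anova(ord2)
```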

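You might also look at adonis(), which tests the same kind of hypothesis here as RDA but from a distance-based viewpoint. Note that it uses Bray-Curtis dissimilarities by default, so its numbers will differ from the RDA above unless you ask for Euclidean distances: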

> adonis(iris.d ~ Species, data = iris)

Call:
adonis(formula = iris.d ~ Species, data = iris) 

Permutation: free
Number of permutations: 999

Terms added sequentially (first to last)

           Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)    
Species     2   2.31730 1.15865  532.74 0.87876  0.001 ***
Residuals 147   0.31971 0.00217         0.12124           
Total     149   2.63701                 1.00000           
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(For some reason that is a lot slower...)
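For a comparison that is closer to the RDA, a sketch using Euclidean distances is given below. It uses adonis2(), which supersedes adonis() in recent vegan releases; the object name res is illustrative. With method = "euclidean" the pseudo-F would be expected to agree with the RDA F statistic, but treat that as something to check on your own installation rather than a guarantee.

```r
library("vegan")

iris.d <- iris[, 1:4]

set.seed(1)
# method = "euclidean" replaces the default Bray-Curtis dissimilarity
res <- adonis2(iris.d ~ Species, data = iris, method = "euclidean")
res
```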

Also see betadisper(): these methods can flag a difference in means (centroids) that is due, at least in part, to differences in variance (dispersion) between the groups.
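A minimal sketch of such a dispersion check follows, assuming Euclidean distances to stay in line with the RDA; the names d and bd are illustrative.

```r
library("vegan")

iris.d <- iris[, 1:4]

# Euclidean distances between the 150 flowers
d <- dist(iris.d)

# Distances of each flower to its own species centroid
bd <- betadisper(d, iris$Species)

set.seed(1)
# Permutation test for equal dispersion among species
permutest(bd)
```

If this test suggests the groups differ in dispersion, a significant result from rda() or adonis() should be read with that in mind.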

Gavin Simpson

Hi, thank you so much! I've tried this code with the iris dataset and am having some trouble. After: ord <- rda(iris.d ~ Species, data = iris) I keep getting the error: Error in model.frame.default(formula = iris.d ~ Species, data = iris) : invalid type (list) for variable 'iris.d' – Anthony BH Jul 31 '15 at 16:18