Hello StackOverflow community,

Five weeks ago I learned to read and write R, and it made me a happier being :) Stack Overflow has helped me out a hundred times or more! I have been struggling with vegan for a while now. So far I have succeeded in making beautiful nMDS plots. The next step for me is DCA, but here I run into trouble...

Let me explain: I have an abundance dataset where the columns are species (N = 120) and the rows are transects (N = 460). The first column, which held the transect codes, has been removed. Abundances are raw counts (not relative or transformed). Most species are rare to very rare, and a couple of species have very high abundance (10000-30000). The total number of individuals is about 100000.

When I run the decorana function it returns this output:

    decorana(veg = DCAMVA) 

    Detrended correspondence analysis with 26 segments.
    Rescaling of axes with 4 iterations.

                    DCA1   DCA2   DCA3   DCA4
    Eigenvalues     0.7121 0.4335 0.1657 0.2038
    Decorana values 0.7509 0.4368 0.2202 0.1763
    Axis lengths    1.7012 4.0098 2.5812 3.3408

The DCA1 species scores are, however, really small. Only one species has a DCA1 value of about 2; the rest are all around -1.4E-4. The species with the high DCA1 score has an abundance of just 1 individual, but it is not the only species with a single individual.

                                 DCA1      DCA2      DCA3      DCA4 Totals
    almaco.jack              6.44e-04  1.85e-01  1.37e-01  3.95e-02      0
    Atlantic.trumpetfish     4.21e-05  5.05e-01 -6.89e-02  9.12e-02    104
    banded.butterflyfish    -4.62e-07  6.84e-01 -4.04e-01 -2.68e-01     32
    bar.jack                -3.41e-04  6.12e-01 -2.04e-01  5.53e-01     91
    barred.cardinalfish     -3.69e-04  2.94e+00 -1.41e+00  2.30e+00     15
    and so on

I can't post the picture on StackOverflow yet, but the idea is that there is spread along the Y-axis and almost none along the X-axis, so the points form a line in the plot.

I guess everything is running okay, since no errors are returned. I only really wonder what the reason for this clustering is. Does anybody have a clue? Is there an ecological idea behind this?

Any help is appreciated :) Love Erik

1 Answer

It looks like your data has an "outlier": a site with a deviant species composition. DCA has essentially selected the first axis to separate this site from everything else, and DCA2 then reflects a major pattern of variance in the remaining sites. (D)CA is known to suffer (if you want to call it that) from this problem, but it is really telling you something about your data. This likely didn't affect NMDS at all, because metaMDS() maps the rank order of the distances between samples. That means it only needs to place this sample slightly further away from every other sample than the distance between the next two most dissimilar samples.
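To see this behaviour in isolation, here is a minimal sketch in base R. It rests on several assumptions: the `comm` matrix is made up for illustration, and it runs plain correspondence analysis via an SVD rather than `decorana()`, so there is no detrending or rescaling. Site 6 contains a single individual of a species found nowhere else, which is roughly the situation described in the question.

```r
# Hypothetical toy community matrix: 5 sites along a gradient, plus one
# site holding a single individual of a species that occurs nowhere else.
comm <- matrix(
  c(10, 5, 0,  0, 0,
     6, 8, 2,  0, 0,
     2, 9, 6,  1, 0,
     0, 4, 9,  5, 0,
     0, 1, 5, 10, 0,
     0, 0, 0,  0, 1),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("site", 1:6), paste0("sp", 1:5))
)

# Plain correspondence analysis by hand (not DCA):
P  <- comm / sum(comm)                        # proportions of the grand total
r  <- rowSums(P)                              # site (row) masses
cm <- colSums(P)                              # species (column) masses
S  <- (P - outer(r, cm)) / sqrt(outer(r, cm)) # standardized residuals
sv <- svd(S)

eig        <- sv$d^2                # CA eigenvalues
site_axis1 <- sv$u[, 1] / sqrt(r)   # site scores on the first axis

print(round(eig, 3))
print(round(site_axis1, 3))
```

In this toy case the first axis does nothing except separate site 6 from the rest: sites 1 to 5 all get the same small score, and site 6 gets a large one. The gradient among the remaining sites only shows up from the second axis onwards.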

You could just stop using (D)CA for these sorts of data and continue to use NMDS via metaMDS() in vegan. An alternative is to apply a transformation such as the Hellinger transformation and then use PCA (see Legendre & Gallagher 2001, Oecologia, for the details). This transformation can be applied via decostand(...., method = "hellinger") but it is trivial to do by hand as well...
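As a rough sketch of the "by hand" route, assuming a made-up count matrix `counts` with transects in rows and species in columns: the Hellinger transformation is the square root of the row-wise relative abundances, and base R's `prcomp()` can then do the PCA. The variable names here are only illustrative.

```r
# Hypothetical count matrix: transects in rows, species in columns,
# with a couple of very abundant species as in the question.
counts <- matrix(
  c(20000, 12,  0, 1,
    15000,  3,  5, 0,
      300, 40,  9, 0,
       10, 25,  0, 2,
        0,  8, 30, 0),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("transect", 1:5), paste0("sp", 1:4))
)

# Hellinger transformation by hand: square root of row proportions.
# Rows that sum to zero would give NaN, so drop empty transects first.
hel <- sqrt(counts / rowSums(counts))

# PCA on the transformed data (centred, not scaled to unit variance).
pca <- prcomp(hel, center = TRUE, scale. = FALSE)

site_scores <- pca$x[, 1:2]   # transect scores on the first two axes
print(round(site_scores, 3))
```

Each row of `hel` has a sum of squares of 1, which is a quick sanity check that the transformation was applied per transect and not per species.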

Gavin Simpson
  • Thanks for your comment! I ran the DCA without the outlier and the plot looks much nicer. I don't really know if I can just leave it out, though. I also don't really understand why this point is the outlier; there are similar points from my point of view. Sorry for the misunderstanding and thanks again for the advice :) – Erik Houtepen Aug 28 '15 at 22:33
  • It may be similar "from your point of view", but it certainly is not similar from DCA's point of view. Your points of view differ, and perhaps you should not use a method that has a different point of view. DCA emphasizes deviation from the norm (= average profile), and a couple of rare species can make a point deviant. If you use `tabasco(decostand(DCAMVA, "max"), decorana(DCAMVA))` you may see DCA's point of view. – Jari Oksanen Aug 29 '15 at 04:59