-1

I’m trying to do an ANCOVA here ... I want to analyze the effect of EROSION FORCE and ZONATION on all the species (listed with small letters) in each POOL.STEP (ranging from 1-12/1-4), while controlling for the effect of FISH.

screenshot

I’m not sure if I’m doing it right. What is the command for ANCOVA?

So far I used lm(EROSIONFORCE~ZONATION+FISH,data=d), which yields:

screenshot2

So what I see here is that both erosion force percentage (intercept?) and sublittoral zonation are significant in some way, but I’m still not sure if I’ve done an ANCOVA correctly here or is this just an ANOVA?

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
metazoa
  • 55
  • 1
  • 1
  • 9
  • 1
    this is a question about *interpretation* of a statistical analysis, so I've voted to close/move to [CrossValided](http://stats.stackexchange.com) – Ben Bolker Oct 22 '16 at 20:41
  • How about just help me interpretate right here? – metazoa Oct 22 '16 at 21:10
  • I try to avoid answering questions that are inappropriate for SO - in the long run, answering questions that aren't good fits for SO encourages people to asks more. If you want to re-post this question to CrossValidated yourself, and delete the question here (you can comment here with the link to the new question - I can see your question + comments after it's deleted) I will consider answering there ... – Ben Bolker Oct 22 '16 at 21:45
  • actually, now that I look at your question, you're farther off than I thought, so I'll give a bit of an answer here (it just morphed back into a programming question) – Ben Bolker Oct 22 '16 at 21:46

1 Answers1

4

In general, ANCOVA (analysis of covariance) is simply a special case of the general linear model with one categorical predictor (factor) and one continuous predictor (the "covariate"), so lm() is the right function to use.

However ... the bottom line is that you have a moderately challenging statistical problem here, and I would strongly recommend that you try to get local help (if you're working within a research group, can you consult with others in your group about appropriate methods?) I would suggest following up either on CrossValidated or r-sig-ecology@r-project.org

  • by putting EROSIONFORCE on the left side of the formula, you're specifying that you want to use EROSIONFORCE as a response (dependent) variable, i.e. your model is estimating how erosion force varies across zones and for different fish numbers - nothing about species response
  • if you want to analyze the response of a single species to erosion and zone, controlling for fish numbers, you need something like
lm(`Acmaeidae s...` ~ EROSIONFORCE+ZONATION+FISH, data=your_data)
  • the lm() suggestion above would do each species independently, i.e. you'd have to do a separate analysis for each species. If you also want to do it separately for each POOL.STEP you're going to have to do a lot of separate analyses. There are various ways of automating this in R, the most idiomatic is probably to melt your data (see reshape2::melt or tidy::gather) into long format and then use lmList from lme4.
  • since you have count data with low means, i.e. lots of zeros (and a few big values), you should probably consider a Poisson or negative binomial model, and possibly even a zero-inflated/hurdle model (i.e. analyze presence-absence and size of positive responses separately)
  • if you really want to analyze the joint distribution of all species (i.e., a response of a multivariate analysis, which is the M in MANOVA), you're going to have to work quite a bit harder ... there are a variety of joint species distribution models by people like Pierre Legendre, David Warton and others ... I'd suggest you try starting with the mvabund package, but you might need to do some reading first
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
  • I think that what I need to do is to flip my formula the other way around! I'd feared that I'd have to do single analyses for each specie, which – metazoa Oct 23 '16 at 09:18