5

I have a dataset with which I would like to compare the effect of species and habitat on homerange size - while using type III errors and pairwise comparisons within species and habitat.
Here's a subset of the data:

species<- c("a","b","c","c","b","c","b","b","a","b","c","c","a","a","b","b","a","a","b","c")
    habitat<-  c("x","x","x","y","y","y","x","x","y","z","y","y","z","z","x","x","y","y","z","z")
    homerange<-c(6,5,7,8,9,4,3,5,6,9,3,6,6,7,8,9,5,6,7,8)
    data1<-data.frame(cbind(species, habitat, homerange))
    data1$homerange<-as.numeric(as.character(data1$homerange))    

Currently I am spltting up the data on the three species, then running separate ANOVAs for each, but I believe it makes more sense to ask about species and habitat at the same time with one ANOVA. Here's an example of the ANOVA I ran for one species:

data.species.a<-subset(data1, species=="a")
fit<-aov(homerange ~ habitat, data=data.species.a)
summary(fit)
TukeyHSD(fit)

aov() appears to use type I errors . . . which I don't think are appropriate; plus I believe Tukey's test may be too conservative an approach for the pairwise comparisons. Can someone help me with an approach that allows me to run one ANOVA that considers both the effect of species and habitat on homerange, with type III errors, that also permits a less-conservative pairwise comparisons of species and habitat?

zx8754
  • 52,746
  • 12
  • 114
  • 209
Luke
  • 4,769
  • 5
  • 28
  • 36
  • Might be better suited for [CrossValidated.](http://stats.stackexchange.com/) They answer [general R questions,](http://meta.stats.stackexchange.com/questions/1/how-to-answer-r-questions) too, but this also seeks modeling advice. – Christopher Sep 05 '12 at 19:24
  • I think you mean type III sums of squares. – Dason Sep 05 '12 at 19:49
  • I'm a little confused. For a model with a single predictor variable, tests based on type I, II, III SSQ give *identical* results (compare the results of your `aov` with @DWin's answer below and with a minor variant of his answer where you would use `Anova(...,type="II")`. Did you mean to test `aov(homerange ~ habitat*species)` above? I second the CrossValidated suggestion. – Ben Bolker Sep 05 '12 at 21:35

1 Answers1

6

You can set up Anova in package 'car' to report type III sums of squares and there is an HSD.test in package 'agricolae' that should be able to take that model object as input. I do not think you can legitimately use aov() with your data being unbalanced, so I am doing it with an lm() fit.

fit<-lm(homerange ~ habitat, data=data.species.a)
require(car)
 Anova(fit, type="III")
require(agricolae)
comparison <- HSD.test(fit, "habitat", group=TRUE)

Note that the SAS default of type-III sums of square is viewed with disdain (and sometimes even outright derision) by the authors of the R base package (read this for more details). The presentation of that method in package 'car' is mainly for purposes of comparison, rather than being a recommendation regarding statistical correctness.

To add citations to the reasons for being very cautious about accepting the SAS-standard: Frank Harrell's comments re: loss of power and Bill Venables' later comments in the same thread on r-help

IRTFM
  • 258,963
  • 21
  • 364
  • 487
  • how does `HSD.test` consider the selected type of sums of squares in `Anova`? Or one has to run `Anova` with SS2 or 3 and with significant effects do`HSD.test` with `unbalanced = TRUE` option? – abc Jan 25 '21 at 10:07
  • `HSD.test` doesn't "know" that you previously asked for type-III SS, since the fit object was produced by `lm` and not modified by the `Anova`-call. This is not a good example to use since the SS-table is the same for type-II and type-II SS in this case, except for the intercept. I'm not sure what your question is really about. Are you asking how the code does its work? The code is easily produced by just typing the name of the function at the console (although in Rstudio you need to type an extra space after the last letter in the function to prevent it from adding the parentheses. – IRTFM Jan 25 '21 at 17:20
  • 1
    my question was how to let `agricolae::HSD.test` know the SS type used in `car::Anova` I use for (highly) unbalanced design? Is it necessary? It's warned that `stats::TukeyHSD` by default uses SS1 from `stats::aov` [Multiple Comparisons](https://www.statmethods.net/stats/anova.html). Or the correct way is just: aov or lm > Anova with aov or lm > select p<0.05 effects in Anova > HSD.test with selected effects ? – abc Jan 26 '21 at 09:44
  • HSD.test does NOT use the results of Anova. – IRTFM Jan 26 '21 at 22:50