car::Anova in R gives different p-values for Type II vs Type III even though I have a balanced design?

Question

I have tried in vain to find the answer to this question in the other two-way ANOVA discussions I have found, so I hope this isn't a repeat.

I have a balanced design (5 observations in each of 4 groups). To my understanding, this means that performing a two-way ANOVA using type I, II, or III sums of squares should all give the same results for the main effects and/or interaction.

My linear model is:

mod <- lm(X2HG ~ Genotype * Diet, data = df)

When I run a Type II ANOVA in R using:

Anova(mod, type=2)
Anova Table (Type II tests)
Response: X2HG
                  Sum Sq Df F value   Pr(>F)   
Genotype      2.9033e+11  1  4.5914 0.047854 * 
Diet          8.4475e+11  1 13.3594 0.002136 **
Genotype:Diet 5.5728e+11  1  8.8132 0.009051 **
Residuals     1.0117e+12 16

The F and p-values for the main effects are VERY different from Type III:

Anova(mod, type=3)
Anova Table (Type III tests)

Response: X2HG
                  Sum Sq Df F value    Pr(>F)    
(Intercept)   1.5182e+12  1 24.0103 0.0001602 ***
Genotype      2.1568e+10  1  0.3411 0.5673422    
Diet          1.4894e+10  1  0.2355 0.6340278    
Genotype:Diet 5.5728e+11  1  8.8132 0.0090510 ** 
Residuals     1.0117e+12 16

Can anyone explain the difference? Am I wrong in thinking they should be the same?

EDIT: For further clarity, I get the same output using Type II and Type III ANOVA from the bioinfokit package in Python, so this isn't specific to the car package in R.

Limey · Accepted Answer · score 3 · 2023-05-13T08:51:48.130

No. Type III effects are adjusted for all other terms in the model. Type II effects are adjusted for all terms other than those which contain them. One effect contains another if the other effect can be expressed as a linear combination of the first effect's terms. Thus, interactions contain the main effects of their constituent terms.

That's why you see the statistics for everything except the interaction term changing. With R's default treatment contrasts, the Type I, II and III effects will be equal for balanced data only in the absence of interaction terms.
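A minimal sketch of this behaviour on simulated data. The factor levels (WT/KO, Chow/HFD) and the response y are hypothetical and are not the asker's data. It assumes R's default treatment contrasts and uses base R's drop1() with scope = . ~ . as a stand-in for Anova(mod, type = 3). The interaction row is unchanged, while the Genotype row turns into a test of the genotype difference within the reference diet only.

```r
# Simulated balanced 2 x 2 design, 5 observations per cell (hypothetical data).
set.seed(1)
df <- expand.grid(
  rep      = 1:5,
  Genotype = factor(c("WT", "KO"), levels = c("WT", "KO")),
  Diet     = factor(c("Chow", "HFD"), levels = c("Chow", "HFD"))
)
# Cell means include a Genotype x Diet interaction.
df$y <- with(df, 10 + 2 * (Genotype == "KO") + 3 * (Diet == "HFD") +
                   4 * (Genotype == "KO" & Diet == "HFD")) + rnorm(nrow(df))

mod <- lm(y ~ Genotype * Diet, data = df)  # default treatment contrasts

# Sequential (Type I) table; in a balanced design its main-effect rows
# equal the Type II rows.
type1 <- anova(mod)

# Type III-style table: each term is dropped while all others stay in.
type3 <- drop1(mod, scope = . ~ ., test = "F")

# Genotype difference within the reference diet (Chow) only, and the
# one-degree-of-freedom sum of squares for that difference.
simple_effect <- unname(coef(mod)["GenotypeKO"])
ss_simple <- 5 * 5 / (5 + 5) * simple_effect^2  # n1 * n2 / (n1 + n2) * diff^2

print(type1)
print(type3)
```

In this sketch the Genotype row of type3 has the same sum of squares as ss_simple, which is why it no longer matches the Genotype row of type1.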

You can reconstruct the Type III SS using Type I SS by changing the order in which terms are added to the model: the Type III SS for each effect is equal to the Type I SS when that effect is added to the model last. This means you need to fit multiple models to get the Type III effects in this way.

You can do a similar thing to recover the Type II effects from the Type I effects by ensuring that the main effect terms are all fitted before any interaction term.
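The Type II reconstruction can be sketched with base R's anova(), which reports sequential (Type I) sums of squares. The data below are simulated and deliberately made unbalanced, an assumption made purely for illustration so that the order of the main effects matters. The factor levels and the response y are hypothetical.

```r
# Simulated 2 x 2 data (hypothetical), then made unbalanced on purpose.
set.seed(2)
df <- expand.grid(
  rep      = 1:5,
  Genotype = factor(c("WT", "KO"), levels = c("WT", "KO")),
  Diet     = factor(c("Chow", "HFD"), levels = c("Chow", "HFD"))
)
df$y <- with(df, 10 + 2 * (Genotype == "KO") + 3 * (Diet == "HFD")) +
  rnorm(nrow(df))
df <- df[-(1:3), ]  # drop three WT/Chow rows: cell sizes become 2, 5, 5, 5

# Same model, main effects entered in both possible orders,
# interaction always last.
seq_gd <- anova(lm(y ~ Genotype + Diet + Genotype:Diet, data = df))
seq_dg <- anova(lm(y ~ Diet + Genotype + Diet:Genotype, data = df))

# Type II SS for each main effect: the Type I SS from the fit in which
# that main effect is entered after the other main effect.
type2 <- c(Genotype = seq_dg["Genotype", "Sum Sq"],
           Diet     = seq_gd["Diet", "Sum Sq"])

# Cross-check against explicit nested-model comparisons.
rss <- function(formula) deviance(lm(formula, data = df))
check <- c(Genotype = rss(y ~ Diet) - rss(y ~ Genotype + Diet),
           Diet     = rss(y ~ Genotype) - rss(y ~ Genotype + Diet))

print(type2)
print(check)
```

With balanced data the two orderings give the same main-effect rows, so a single sequential table with the interaction last already contains the Type II main effects.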

edited May 13 '23 at 08:51

answered Aug 11 '21 at 11:57


Thank you! This makes sense. I was getting confused as certain tutorials such as here: http://www.sthda.com/english/wiki/two-way-anova-test-in-r and here: https://mcfromnz.wordpress.com/2011/03/02/anova-type-iiiiii-ss-explained/ were giving an explanation as follows: "NOTE: when data is balanced, the factors are orthogonal, and types I, II and III all give the same results." Then however, in the latter reference it explains: "due to the way in which the SS are calculated when incorporating the interaction effect, for type III you must specify the contrasts option to obtain sensible results". – jmorrr Aug 11 '21 at 12:31
Could you also help with another two-way ANOVA question whilst we're here? In a 2×2 factorial design, if there is no significant interaction effect, yet there is a main effect, it is unnecessary to follow up with a post hoc test (e.g. Tukey HSD). How then can we explain the effect if one mean within an IV (IV-A) is down compared to the other IV (IV-B) and the other mean for IV-A is up compared to the means for IV-B? Or is this impossible? This has been bugging me. – jmorrr Aug 11 '21 at 12:45

score 2 · Answer 2 · answered Aug 11 '21 at 12:36

As per this reference: https://mcfromnz.wordpress.com/2011/03/02/anova-type-iiiiii-ss-explained/

"First, it is necessary to set the contrasts option in R. Because the multi-way ANOVA model is over-parameterised, it is necessary to choose a contrasts setting that sums to zero, otherwise the ANOVA analysis will give incorrect results with respect to the expected hypothesis. (The default contrasts type does not satisfy this requirement.)"

Anova(lm(X2HG ~ Genotype * Diet, data = df, contrasts=list(Genotype=contr.sum, Diet=contr.sum)), type=3)

Doing this gave the following output:

Anova Table (Type III tests)

Response: X2HG
                  Sum Sq Df  F value    Pr(>F)    
(Intercept)   1.0085e+13  1 159.4960 9.779e-10 ***
Genotype      2.9033e+11  1   4.5914  0.047854 *  
Diet          8.4475e+11  1  13.3594  0.002136 ** 
Genotype:Diet 5.5728e+11  1   8.8132  0.009051 ** 
Residuals     1.0117e+12 16

This matches the Type II table above and the output of the aov() function (which uses Type I, i.e. sequential, sums of squares).
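A small self-contained check of the same idea on simulated balanced data. The factor levels and the response y are hypothetical, and base R's drop1() with scope = . ~ . is used here as a stand-in for Anova(..., type = 3), so treat it as a sketch and not as the car implementation. With sum-to-zero contrasts and equal cell sizes, the Type III-style sums of squares coincide with the sequential ones.

```r
# Simulated balanced 2 x 2 design with an interaction (hypothetical data).
set.seed(3)
df <- expand.grid(
  rep      = 1:5,
  Genotype = factor(c("WT", "KO"), levels = c("WT", "KO")),
  Diet     = factor(c("Chow", "HFD"), levels = c("Chow", "HFD"))
)
df$y <- with(df, 10 + 2 * (Genotype == "KO") + 3 * (Diet == "HFD") +
                   4 * (Genotype == "KO" & Diet == "HFD")) + rnorm(nrow(df))

# Same model as in the thread, but with sum-to-zero contrasts on both factors.
fit_sum <- lm(y ~ Genotype * Diet, data = df,
              contrasts = list(Genotype = contr.sum, Diet = contr.sum))

type1 <- anova(fit_sum)                             # sequential (Type I)
type3 <- drop1(fit_sum, scope = . ~ ., test = "F")  # Type III-style

# Put the two sets of sums of squares side by side.
term_names <- c("Genotype", "Diet", "Genotype:Diet")
comparison <- cbind(TypeI   = type1[term_names, "Sum Sq"],
                    TypeIII = type3[term_names, "Sum of Sq"])
rownames(comparison) <- term_names
print(comparison)
```

Setting options(contrasts = c("contr.sum", "contr.poly")) before fitting is another way to get sum-to-zero contrasts for every unordered factor in the session, instead of passing the contrasts argument to each lm() call.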
