Tukey Post Hoc Anova shows unexpected degrees of freedom

Question

I am running an anova in R using lme4 package to investigate the effects of LUT and Climate on EMF (dataset with 50 observations below). Afterwards, to investigate, where the effect occurs, I perform a tukey posthoc analysis using lsmeans package. Surprisingly, the degrees of freedom for the different comparisons are not always 32 (as expected), see screenshot attached. Anyone sees the solution to that?

Results pairwise comparison

contrast        estimate     SE   df t.ratio p.value
 CF amb - EM amb  0.03087 0.0401 32.0   0.770  0.9986
 CF amb - EP amb  0.00557 0.0428 33.2   0.130  1.0000
 CF amb - IM amb -0.04434 0.0401 32.0  -1.107  0.9804
 CF amb - OF amb -0.17754 0.0385 32.6  -4.610  0.0021
 CF amb - CF fut  0.03547 0.0409 39.7   0.867  0.9967
 CF amb - EM fut  0.09094 0.0409 39.7   2.222  0.4595
 CF amb - EP fut  0.07230 0.0409 39.7   1.767  0.7511
 CF amb - IM fut  0.02319 0.0409 39.7   0.567  0.9999
 CF amb - OF fut -0.09254 0.0409 39.7  -2.261  0.4350
 EM amb - EP amb -0.02530 0.0428 33.2  -0.591  0.9998
 EM amb - IM amb -0.07520 0.0401 32.0  -1.877  0.6833
 EM amb - OF amb -0.20841 0.0385 32.6  -5.411  0.0002
 EM amb - CF fut  0.00460 0.0409 39.7   0.112  1.0000
 EM amb - EM fut  0.06008 0.0409 39.7   1.468  0.8966
 EM amb - EP fut  0.04143 0.0409 39.7   1.012  0.9897
 EM amb - IM fut -0.00768 0.0409 39.7  -0.188  1.0000
 EM amb - OF fut -0.12341 0.0409 39.7  -3.015  0.1089
 EP amb - IM amb -0.04990 0.0428 33.2  -1.166  0.9727
 EP amb - OF amb -0.18311 0.0418 35.4  -4.379  0.0035
 EP amb - CF fut  0.02990 0.0436 39.8   0.685  0.9995
 EP amb - EM fut  0.08538 0.0436 39.8   1.957  0.6321
 EP amb - EP fut  0.06673 0.0436 39.8   1.530  0.8720
 EP amb - IM fut  0.01762 0.0436 39.8   0.404  1.0000
 EP amb - OF fut -0.09811 0.0436 39.8  -2.249  0.4426
 IM amb - OF amb -0.13321 0.0385 32.6  -3.458  0.0426
 IM amb - CF fut  0.07980 0.0409 39.7   1.950  0.6368
 IM amb - EM fut  0.13528 0.0409 39.7   3.305  0.0557
 IM amb - EP fut  0.11664 0.0409 39.7   2.850  0.1549
 IM amb - IM fut  0.06753 0.0409 39.7   1.650  0.8155
 IM amb - OF fut -0.04821 0.0409 39.7  -1.178  0.9716
 OF amb - CF fut  0.21301 0.0394 39.0   5.404  0.0001
 OF amb - EM fut  0.26849 0.0394 39.0   6.812  <.0001
 OF amb - EP fut  0.24984 0.0394 39.0   6.339  <.0001
 OF amb - IM fut  0.20073 0.0394 39.0   5.093  0.0004
 OF amb - OF fut  0.08500 0.0394 39.0   2.156  0.5017
 CF fut - EM fut  0.05548 0.0401 32.0   1.385  0.9232
 CF fut - EP fut  0.03683 0.0401 32.0   0.919  0.9946
 CF fut - IM fut -0.01228 0.0401 32.0  -0.306  1.0000
 CF fut - OF fut -0.12801 0.0401 32.0  -3.195  0.0788
 EM fut - EP fut -0.01864 0.0401 32.0  -0.465  1.0000
 EM fut - IM fut -0.06775 0.0401 32.0  -1.691  0.7922
 EM fut - OF fut -0.18349 0.0401 32.0  -4.580  0.0024
 EP fut - IM fut -0.04911 0.0401 32.0  -1.226  0.9624
 EP fut - OF fut -0.16485 0.0401 32.0  -4.115  0.0083
 IM fut - OF fut -0.11574 0.0401 32.0  -2.889  0.1505

Degrees-of-freedom method: kenward-roger 
P value adjustment: tukey method for comparing a family of 10 estimates

# create dataset
# Create dataframe
emf_data <- data.frame(
  Block = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10),
  Plot = c("01_1", "01_2", "01_3", "01_4", "01_5", "02_1", "02_2", "02_3", "02_4", "02_5", "03_1", "03_2", "03_3", "03_4", "03_5", "04_1", "04_2", "04_3", "04_4", "04_5", "05_1", "05_2", "05_3", "05_4", "05_5", "06_1", "06_2", "06_3", "06_4", "06_5", "07_1", "07_2", "07_3", "07_4", "07_5", "08_1", "08_2", "08_3", "08_4", "08_5", "09_1", "09_2", "09_3", "09_4", "09_5", "10_1", "10_2", "10_3", "10_4", "10_5"),
  LUT = c("EM", "CF", "EP", "OF", "IM", "IM", "EP", "OF", "CF", "EM", "CF", "OF", "IM", "EM", "EP", "CF", "OF", "IM", "EM", "EP", "OF", "IM", "EM", "CF", "OF", "EP", "CF", "EM", "OF", "IM", "OF", "EM", "EP", "IM", "CF", "IM", "EP", "EM", "CF", "OF", "EM", "IM", "CF", "EP", "OF", "EP", "EM", "OF", "IM", "CF"),
  Climate = c("amb", "amb", "amb", "amb", "amb", "fut", "fut", "fut", "fut", "fut", "amb", "amb", "amb", "amb", "amb", "fut", "fut", "fut", "fut", "fut", "amb", "amb", "amb", "amb", "amb", "fut", "fut", "fut", "fut", "fut", "fut", "fut", "fut", "fut", "fut", "amb", "amb", "amb", "amb", "amb", "fut", "fut", "fut", "fut", "fut", "amb", "amb", "amb", "amb", "amb"),
  EMF = c(0.386534098, 0.46301213, 0.490665819, 0.721414456, 0.509056576, 0.464901551, 0.428232861, 0.52899388, 0.399248469, 0.36825831, 0.449465674, 0.614773512, 0.489035327, 0.386153492, 0.484446186, 0.541415126, 0.631957258, 0.444547778, 0.375668212, 0.407520486, 0.716481701, 0.536012334, 0.541456363, 0.453584094, 0.393995629, 0.430187565, 0.382135054, 0.384626746, 0.59484382, 0.402980148, 0.516302924, 0.344090374, 0.344659404, 0.455428398, 0.413082427, 0.447645874, 0.412076009, 0.373668918, 0.449430167, 0.703724138, 0.33978796, 0.383338418, 0.353935011, 0.295040775, 0.457778567, 0.404055142, 0.424999363, 0.635655693, 0.507076667, 0.451659421)
)

## install.packages("lme4")
## install.packages("lsmeans")
library(lme4)
library(lsmeans)
model <- lmer(EMF ~ LUT * Climate + (1 | Block), data = emf_data)
emm <- lsmeans(model, ~ LUT * Climate)
pairwise_comp <- pairs(emm, adjust = "tukey")
summary(pairwise_comp)

Link to the dataset: https://drive.google.com/file/d/1zr1Bh7_RxUyYftmKCNiv36rIs92eIGce/view?usp=sharing

By default, the `lmer` function uses the Satterthwaite approximation to estimate the degrees of freedom for the fixed effects. If you specifically want to obtain integer degrees of freedom, you can use the `lmerTest` package. `install.packages("lmerTest")` — Phoenix, Jul 07 '23 at 14:38

score 0 · Answer 1 · answered Jul 08 '23 at 19:09

One of the emmeans vignettes (emmeans is the newer version of lsmeans, and you're encouraged to switch ...) explains that the Kenward-Roger estimate of the denominator degrees of freedom (ddf) is used (this is also stated at the bottom of the printout of your pairwise contrasts):

You will notice that the degrees of freedom are fractional: that is due to the fact that whole-plot and subplot variations are combined when standard errors are estimated. Different degrees-of-freedom methods are available. By default, the Kenward-Roger method is used, and that’s why you see a message about the pbkrtest package being loaded, as it implements that method. We may specify a different degrees-of-freedom method via the optional argument lmer.df: emmeans(Oats.lmer, "nitro", lmer.df = "satterthwaite")

(However, the Satterthwaite approximation will also give you fractional df estimates that vary among contrasts (because different groups have different variances.)

There are other, more traditional methods of estimating ddf ("containment", "between-within"), e.g. as used in nlme::lme(). They do always give integer estimates, and estimates that are identical across groups, but they aren't available in emmeans (they're less popular nowadays because they don't generalize well to unbalanced designs and complex [e.g., crossed] random effect structures).

My advice is not to worry about it too much; for example, the difference between 32 and 39 df in a t-distribution, for a t-statistic of 2.5, is the difference between 2*pt(-2.5, 32) == 0.0177 and 2*pt(-2.5, 39) == 0.0167 ...

Tukey Post Hoc Anova shows unexpected degrees of freedom

1 Answers1