I am trying to replicate a textbook example of a Tukey Test for One-Way Mixed Effect ANOVA (from Statistics, William L. Hays p 581-583) but the p-values I am getting using lme & glht don't make sense
The study has repeated measures of four treatments and 10 subjects
The Data
subject=c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6,
6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10)
treatment=c("a1", "a2", "a3", "a4", "a1", "a2", "a3", "a4", "a1", "a2", "a3",
"a4", "a1", "a2", "a3", "a4", "a1", "a2", "a3", "a4", "a1", "a2",
"a3", "a4", "a1", "a2", "a3", "a4", "a1", "a2", "a3", "a4", "a1",
"a2", "a3", "a4", "a1", "a2", "a3", "a4")
response=c(11, 9, 5, 17, 14, 12, 10, 18, 15, 7, 6, 21, 17, 10, 13, 22, 15, 7, 6,
15, 14, 8, 13, 22, 9, 6, 7, 15, 17, 11, 10, 19, 10, 13, 14, 23, 12,
8, 11, 20)
dataFrame=data.frame(subject, treatment, response)
The model
library(nlme)
model = lme(response~ treatment,random=~1|subject,data=dataFrame)
anova(model)
numDF denDF F-value p-value
(Intercept) 1 27 375.9198 <.0001
treatment 3 27 43.4507 <.0001
This F-value is close enough to Hay's (F = 43.41
) that I am pretty sure my model is fine.
The Tukey Test
library(multcomp)
glht.out =glht(model, mcp(treatment="Tukey"))
summary(glht.out)
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: lme.formula(fixed = response ~ treatment, data = dataFrame, random = ~1 |
subject)
Linear Hypotheses:
Estimate Std. Error z value Pr(>|z|)
a2 - a1 == 0 -4.300 1.006 -4.276 <0.001 ***
a3 - a1 == 0 -3.900 1.006 -3.879 <0.001 ***
a4 - a1 == 0 5.800 1.006 5.768 <0.001 ***
a3 - a2 == 0 0.400 1.006 0.398 0.979
a4 - a2 == 0 10.100 1.006 10.044 <0.001 ***
a4 - a3 == 0 9.700 1.006 9.647 <0.001 ***
This is not consistent with the book. Hay's only reported some comparisons and gave HSD and mean, not p but for the a3-a1 contrast found there was no significant difference with an HSD = 4.02
and mean = 3.9
which by my calculation has p.value = 1-ptukey(3.9*sqrt(10/8.27),4,9)=0.05719563
.
The R output also doesn't make sense because the Tukey test p-value, which is supposed to control for multiple comparisons, is much much smaller than the p-value from a paired t-test (p=0.0142 using t.test(c(11, 14, 15, 17, 15, 14, 9, 17, 10, 12),c(5, 10, 6, 13, 6, 13, 7, 10, 14, 11), paired=TRUE)
.
Any idea what I am doing wrong and how I can correctly perform a Tukey test in R?