I originally posted this question on Cross Validated Stackexchange, and got no answer. Therefore I decided to give it a go here. I am trying to figure out how to obtain lsmeans for a piecewise linear mixed-effects model (fitted with the nlme package) with random intercepts and slopes. My data represent math scores from a group of male and female students taking a test every week before and after the introduction of a daily routine of meditation. A minimal reproducible example to create the data frame and fit the model is as follows:
library(nlme)
library("lsmeans")
# Subject's ID
ID <- c(1,1,1,
2,2,2,
3,3,3,
4,4,4,
5,5,5,
6,6,6,
7,7,7,
8,8,8)
# Time (weeks) before introduction of routine
time1 <- c(-1,0,0,
-1,0,0,
-1,0,0,
-1,0,0,
-1,0,0,
-1,0,0,
-1,0,0,
-1,0,0)
# Time (weeks) before introduction of routine
time2 <- c(0,0,1,
0,0,1,
0,0,1,
0,0,1,
0,0,1,
0,0,1,
0,0,1,
0,0,1)
# week test math scores
mscore <- c(80,92,73,
75,80,85,
60,75,70,
75,80,75,
78,84,75,
78,91,95,
64,72,71,
84,92,70)
# create dataframe
longdata <-data.frame(ID, time1, time2, mscore)
head(longdata)
# fit model
pwmodel <- lme(mscore ~ time1+time2,
random =~ time1+time2|ID,
data=longdata,
method="ML")
# calculate marginal means:
#with both variables
lsmeans(pwmodel, ~(time1+time2),
at=list(time1=c(-1,0,1), time2=c(-1,0,1)) )
# only with one variable
lsmeans(pwmodel, ~time1,
at=list(time1=c(-1,0,1) ))
Here time1 and time2 represent the time before and after the start of the daily meditation routine.
The question is: what is the correct way to obtain lsmeans (or emmeans if that is better) from this model at times -1, 0 and 1? Considering the two time variables or only one of them (either time1 or time2)?
The outputs of both approaches are shown below:
> #with both variables
> lsmeans(pwmodel, ~(time1+time2),
+ at=list(time1=c(-1,0,1), time2=c(-1,0,1)) )
time1 time2 lsmean SE df lower.CL upper.CL
-1 -1 80.8 5.46 7 67.8 93.7
0 -1 89.8 5.47 7 76.8 102.7
1 -1 98.8 5.81 7 85.0 112.5
-1 0 74.2 2.88 7 67.4 81.1
0 0 83.2 2.77 7 76.7 89.8
1 0 92.2 3.28 7 84.5 100.0
-1 1 67.8 3.34 7 59.9 75.6
0 1 76.8 3.12 7 69.4 84.1
1 1 85.8 3.47 7 77.5 94.0
Degrees-of-freedom method: containment
Confidence level used: 0.95
> # only with one variable
> lsmeans(pwmodel, ~time1,
+ at=list(time1=c(-1,0,1) ))
time1 lsmean SE df lower.CL upper.CL
-1 71 2.59 7 64.9 77.1
0 80 2.38 7 74.4 85.6
1 89 2.89 7 82.2 95.8
Results are averaged over the levels of: time2
Degrees-of-freedom method: containment
Confidence level used: 0.95
They clearly return different results, but shouldn't both ways give the same value?