My question is fairly straightforward. When I run the code below, copied directly from the "Applied Longitudinal Analysis" website, here: https://content.sph.harvard.edu/fitzmaur/ala2e/ (It's Chapter 5, Section 5.7),
library(foreign)
ds <- read.dta("tlc.dta")
ds$baseline <- ds$y0
tlclong <- reshape(ds, idvar="id", varying=c("y0","y1","y4","y6"),v.names="y", timevar="time", time=1:4, direction="long")
tlclong <- subset(tlclong, time > 1)
attach(tlclong)
week <- time
week[time==2] <- 1
week[time==3] <- 4
week[time==4] <- 6
time <- time - 1
week.f <- factor(week, c(1,4,6))
change <- y - baseline
cbaseline <- baseline - 26.406
library(nlme)
model <- gls(y ~ I(week.f==1) + I(week.f==4) + I(week.f==6) + I(week.f==1 & trt=="Succimer") + I(week.f==4 & trt=="Succimer") + I(week.f==6 & trt=="Succimer"), corr=corSymm(, form= ~ time | id), weights = varIdent(form = ~ 1 | week.f))
summary(model)
I get output that is completely different in a few very crucial areas.
This is the output I get:
Generalized least squares fit by REML
Model: y ~ I(week.f == 1) + I(week.f == 4) + I(week.f == 6) + I(week.f == 1 & trt == "Succimer") + I(week.f == 4 & trt == "Succimer") + I(week.f == 6 & trt == "Succimer")
Data: NULL
AIC BIC logLik
2028.922 2076.764 -1001.461
Correlation Structure: General
Formula: ~time | id
Parameter estimate(s):
Correlation:
1 2
2 0
3 0 0
Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | week.f
Parameter estimates:
1 4 6
1 1 1
Coefficients:
Value Std.Error t-value p-value
(Intercept) 23.646 1.002955 23.576329 0.0000
I(week.f == 1)TRUE 1.014 1.418393 0.714894 0.4752
I(week.f == 4)TRUE 0.424 1.418393 0.298930 0.7652
I(week.f == 1 & trt == "Succimer")TRUE -11.138 1.418393 -7.852550 0.0000
I(week.f == 4 & trt == "Succimer")TRUE -8.556 1.418393 -6.032180 0.0000
I(week.f == 6 & trt == "Succimer")TRUE -2.884 1.418393 -2.033287 0.0429
Correlation:
(Intr) I(.==1 I(.==4 I=1&t=" I=4&t="
I(week.f == 1)TRUE -0.707
I(week.f == 4)TRUE -0.707 0.500
I(week.f == 1 & trt == "Succimer")TRUE 0.000 -0.500 0.000
I(week.f == 4 & trt == "Succimer")TRUE 0.000 0.000 -0.500 0.000
I(week.f == 6 & trt == "Succimer")TRUE -0.707 0.500 0.500 0.000 0.000
Standardized residuals:
Min Q1 Med Q3 Max
-2.3494198 -0.6575048 -0.1467858 0.5279214 6.0826597
Residual standard error: 7.091964
Degrees of freedom: 300 total; 293 residual
And here is the output shown on the website linked above:
Generalized least squares fit by REML
Model: y ~ I(week.f == 1) + I(week.f == 4) + I(week.f == 6) + I(week.f == 1 & trt == "Succimer") + I(week.f == 4 & trt == "Succimer") + I(week.f == 6 & trt == "Succimer")
Data: NULL
AIC BIC logLik
2451.990 2519.544 -1208.995
Correlation Structure: General
Formula: ~time | id
Parameter estimate(s):
Correlation:
1 2 3
2 0.569
3 0.568 0.775
4 0.575 0.581 0.580
Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | week.f
Parameter estimates:
0 1 4 6
1.000000 1.330103 1.374827 1.529615
Coefficients:
Value Std.Error t-value p-value
(Intercept) 26.406000 0.4998908 52.82354 0.0000
I(week.f == 1)TRUE -1.644501 0.7824044 -2.10186 0.0362
I(week.f == 4)TRUE -2.231356 0.8073811 -2.76370 0.0060
I(week.f == 6)TRUE -2.642065 0.8864616 -2.98046 0.0031
I(week.f == 1 & trt == "Succimer")TRUE -11.340998 1.0931205 -10.37488 0.0000
I(week.f == 4 & trt == "Succimer")TRUE -8.765288 1.1312570 -7.74827 0.0000
I(week.f == 6 & trt == "Succimer")TRUE -3.119869 1.2507776 -2.49434 0.0130
Correlation:
(Intr) I(.==1 I(.==4 I(.==6 I=1&t="
I(week.f == 1)TRUE -0.155
I(week.f == 4)TRUE -0.136 0.674
I(week.f == 6)TRUE -0.068 0.381 0.380
I(week.f == 1 & trt == "Succimer")TRUE 0.000 -0.699 -0.467 -0.265
I(week.f == 4 & trt == "Succimer")TRUE 0.000 -0.466 -0.701 -0.265 0.667
I(week.f == 6 & trt == "Succimer")TRUE 0.000 -0.263 -0.263 -0.705 0.376
I=4&t="
I(week.f == 1)TRUE
I(week.f == 4)TRUE
I(week.f == 6)TRUE
I(week.f == 1 & trt == "Succimer")TRUE
I(week.f == 4 & trt == "Succimer")TRUE
I(week.f == 6 & trt == "Succimer")TRUE 0.375
Standardized residuals:
Min Q1 Med Q3 Max
-2.1636401 -0.7011814 -0.1426534 0.5374840 5.6570302
Residual standard error: 4.998908
Degrees of freedom: 400 total; 393 residual
Note that
- The AIC, BIC, and LogLik values are all different
- The correlation output is completely different. My output gives all zeros, and is missing an entire dimension in the correlation matrix.
- My variance function parameter estimates are completely different; also missing the 0 output shown on the website.
- The estimated coefficients are wildly different in most cases; also my output is completely missing the week.f == 6 results.
- The correlation matrix outputted below the coefficients results is completely different.
- My output says 300 degrees of freedom total; Fitzmaurice's says 400 total (the rest of that last section of output is obviously different as well).
Also, when I run the code, I get an error saying
Error in glsEstimate(object, control = control) :
computed "gls" fit is singular, rank 7
So, to even get any output at all, I had to add this argument to the gls() function:
control = list(singular.ok = TRUE)
So my question is, why is the output so different when I run the code myself? Clearly whoever made the website neglected to copy the code correctly.
If you would like to try running the code yourself, the website linked above has the dataset tlc.dta available for download.