Folks,
I'm trying to fit a model in R using lme() from the {nlme} package with many fixed effects and additional random effects (gotta be {nlme} since I also want to include an AR(1) correlation matrix later on). However, the model is terribly slow (due to the huge number of fixed effects).
Now, we know that running a model with dummies is numerically, but not computationally, equivalent to running a model on the de-meanded data (within estimation). Hence, the following two models lead to equivalent results, but the first one runs much faster:
# Load library
library(nlme) # For lme()
library(lfe) # For demeanlist()
# Create reproducible dummy dataset
set.seed(1)
dat <- data.frame(y=rnorm(1000), x=rnorm(1000), cluster1=sample(LETTERS, 1000, replace=TRUE), cluster2=sample(LETTERS, 1000, replace=TRUE))
# Create de-meaned dataset
dat_demean <- demeanlist(dat[,c("y", "x")], list(dat$cluster1))
dat_demean$cluster2 <- dat$cluster2 # Copy cluster2 column over to de-meaned data
# Model 1: Fixed effect on cluster1
orig <- lm( y ~ x + cluster1, data=dat)
deme <- lm( y ~ x - 1, data=dat_demean)
# This works as expected. Coefficients are exactly identical (SE are wrong and need to be fixed).
all.equal( coef(orig)["x"], coef(deme)["x"] )
R> [1] TRUE
Can something similar be done when we want to additionally include random effects on the second grouping variable? Below is my attempt which does not work.
# Model 2: Fixed effect on cluster1 and additional random effect for cluster2
orig <- lme( y ~ x + cluster1, random=~1|cluster2, data=dat)
deme <- lme( y ~ x -1, random=~1|cluster2, data=dat_demean)
# This does not work - coefficients are different
fixef(orig)["x"]
R> x
R> -0.005885881
fixef(deme)["x"]
R> x
R> -0.006169549
all.equal(fixef(orig)["x"], fixef(deme)["x"])
R> [1] "Mean relative difference: 0.04819478"
What am I missing or what am I doing wrong? Or is it just not that simply possible to fit random effects models on de-meaned data?
Cheers