
This is a follow-up question to a previous post (How to modify slots lme4 >1.0). I have a similar pairwise data structure and want the random effect to consider both "pops" in the pair. I have a working random-intercept model using the code suggested there:

dat <- data.frame(pop1 = c(2,1,1,1,1,3,2,2,2,3,5,3,5,4,6),
              pop2 = c(1,3,4,5,6,2,4,5,6,4,3,6,4,6,5), 
              X = c(20,25,18,40,36,70,68,72,78,76,97,100,115,110,108),
              Y = c(18,16,15,40,22,18,18,18,18,45,10,47,67,5,6))
#build random effects matrix
Zl<-lapply(c("pop1","pop2"),function(nm)Matrix:::fac2sparse(dat[[nm]],"d",drop=FALSE))
ZZ<-Reduce("+",Zl[-1],Zl[[1]])
#specify model structure
mod<-lFormula(Y~X+(1|pop1),data=dat,REML=TRUE)
#replace slot
mod$reTrms$Zt <- ZZ
#fit model
dfun<-do.call(mkLmerDevfun,mod)
opt<-optimizeLmer(dfun)
mkMerMod(environment(dfun),opt,mod$reTrms,fr=mod$fr)

However, when attempting to add a random slope variable:

mod2<-lFormula(Y~X+(1+X|pop1),data=dat,REML=TRUE)
mod2$reTrms$Zt <- ZZ
dfun<-do.call(mkLmerDevfun,mod2)

this results in the same error identified in the previous post (where the issue was calling the wrong data frame):

Error in Lambdat %*% Ut : Cholmod error 'A and B inner dimensions must match' at file ../MatrixOps/cholmod_ssmult.c, line 82
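
Checking dimensions shows where the mismatch comes from (a quick sketch using the objects defined above): with a random slope, lFormula builds a Zt with two rows per population (intercept and slope), so the 6-row ZZ no longer conforms with the 12 x 12 Lambdat.

dim(lFormula(Y~X+(1|pop1),data=dat,REML=TRUE)$reTrms$Zt)   ## 6 x 15, same shape as ZZ
dim(lFormula(Y~X+(1+X|pop1),data=dat,REML=TRUE)$reTrms$Zt) ## 12 x 15: intercept + slope row per pop
dim(ZZ)                                                    ## 6 x 15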

To see a separate lm fit for each pop:

plot(1,type="n",xlim=c(0,150),ylim=c(0,75),ylab = "Y",xlab="X")
for(i in 1:length(unique(c(dat$pop1,dat$pop2)))){
  subdat<-dat[which(dat$pop1==i | dat$pop2==i),]
  out<-summary(lm(subdat$Y~subdat$X))
  x=subdat$X
  y=x*out$coefficients[2,1]+out$coefficients[1,1]
  lines(x,y,col=i)
}
legend(125,60,1:6,col=1:6,lty=1,title="Pop")
L Nathan
  • It would be really useful if, for future reference, you could give a little bit of context here about the use case for this type of model ... – Ben Bolker Aug 02 '16 at 01:47

1 Answer

dat <- data.frame(pop1 = c(2,1,1,1,1,3,2,2,2,3,5,3,5,4,6),
                pop2 = c(1,3,4,5,6,2,4,5,6,4,3,6,4,6,5), 
                X = c(20,25,18,40,36,70,68,72,78,76,97,100,115,110,108),
                Y = c(18,16,15,32,22,29,32,38,44,45,51,47,67,59,61))

It helps to try to understand what the original code is actually doing:

## build random effects matrix
## 1. sparse dummy-variable matrices for each population ID
Zl <- lapply(dat[c("pop1","pop2")],
       Matrix::fac2sparse,to="d",drop.unused.levels=FALSE)
 ## 2. take the sum of all components of the list of dummy-variable matrices ...
ZZ <- Reduce("+",Zl[-1],Zl[[1]])

The Reduce form is convenient in general if we have a long list, but it helps to see that in this case it's just Zl[[1]]+Zl[[2]] ...

all.equal(Zl[[1]]+Zl[[2]],ZZ) ## TRUE

What does this RE structure look like?

library(gridExtra)
grid.arrange(
   image(t(Zl[[1]]),main="pop 1",sub="",xlab="pop",ylab="obs"),
   image(t(Zl[[2]]),main="pop 2",sub="",xlab="pop",ylab="obs"),
   image(t(ZZ),main="combined",sub="",xlab="RE",ylab="obs"),
   nrow=1)

[image: sparsity patterns of t(Zl[[1]]), t(Zl[[2]]) and t(ZZ): "pop 1", "pop 2", "combined"]

For the random slope, I think we want to take each filled element of ZZ and replace it with the X value observed for the corresponding observation/row of dat. The indexing here is a bit obscure; in this case it boils down to there being 2 filled values in each row of Z/column of Zt (the @p slot of the sparse matrix gives a zero-indexed pointer to the first non-zero element in each column ...)

## fill the non-zero entries of ZZ with the corresponding observation's X value:
## diff(ZZ@p) gives the number of non-zeros in each column (here always 2)
vals <- dat$X[rep(1:(length(ZZ@p)-1),diff(ZZ@p))]
ZZX <- ZZ
ZZX@x <- vals   ## overwrite the stored non-zero values (column-major order)
image(t(ZZX))

[image: sparsity pattern of t(ZZX): same structure as the combined matrix, with X values in place of the 1s]
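
A quick sanity check of that indexing (using the objects above): every observation should contribute exactly two non-zero entries, and the substituted values should just be each observation's X repeated twice.

table(diff(ZZ@p))                  ## every column of ZZ has exactly 2 non-zeros
all(vals == rep(dat$X, each = 2))  ## TRUE: each X value appears once per pop in the pair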

library(lme4)
mod <- lFormula(Y~X+(X|pop1),data=dat,REML=TRUE)
## replace slot
mod$reTrms$Zt <- rbind(ZZ,ZZX)
## fit model
dfun <- do.call(mkLmerDevfun,mod)
opt <- optimizeLmer(dfun)
m1 <- mkMerMod(environment(dfun),opt,mod$reTrms,fr=mod$fr)

This seems to work, but you should certainly check it with your own knowledge of what's supposed to be going on here ...
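
One way to check is to pull the implied per-population slopes out of the fit. With the hacked Zt, the random-effect vector is ordered like the rows of rbind(ZZ,ZZX), i.e. six intercept deviations followed by six slope deviations, so the usual ranef() layout (which assumes lme4's interleaved ordering) may not line up; a sketch that indexes the conditional modes directly:

b <- as.vector(getME(m1, "b"))   ## conditional modes, in the order of the rows of Zt
fixef(m1)["X"] + b[7:12]         ## fixed slope + each population's slope deviation

Relative to the per-population lm fits in the question, these slopes will typically be shrunk toward the overall slope (see the comments below).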

Ben Bolker
  • This solved the issue of that error, but the model results don't seem accurate. For instance, using a different data set (I adjusted it in the original post to provide better resolution), the slopes produced are very similar among pops, despite pretty stark differences in the data. Slope coef= 0.11,0.11,0.11,0.08,0.14,0.069. – L Nathan Aug 02 '16 at 17:06
  • I don't understand your use case terribly well, but ... could you just be seeing shrinkage effects? i.e., when you have a small number of populations and a relatively large amount of noise in the simulation, you should *expect* the estimated slopes to differ much less than the observed slopes. – Ben Bolker Aug 02 '16 at 19:07
  • Yes, that is a good point and is likely what is causing the effect I am noticing. Thank you for your help! – L Nathan Aug 03 '16 at 00:14