5

I am currently writing a script to evaluate the (restricted) log-likelihood function for use in linear mixed models. I need it to calculate the likelihood of a model with some parameters fixed to arbitrary values. Maybe this script is helpful to some of you as well!

I use lmer() from lme4 and logLik() to check whether my script works as it should. And as it seems , it does not! As my educational background wasn't really concerned with this level of statistics, I am a bit lost finding the mistake.

Following, you will find a short example script using the sleepstudy-data:

  # * * * * * * * * * * * * * * * * * * * * * * * * 
  # * example data

  library(lme4)
  data(sleepstudy)
  dat <- sleepstudy[ (sleepstudy$Days %in% 0:4) & (sleepstudy$Subject %in% 331:333) ,]
  colnames(dat) <- c("y", "x", "group")

  mod0 <- lmer( y ~ 1 + x + ( 1 | group ), data = dat)  


  # + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + #
  #                                                             #
  #   Evaluating the likelihood-function for a LMM              #
  #   specified as: Y = X*beta + Z*b + e                        #
  #                                                             #
  # + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + 

  # * * * * * * * * * * * * * * * * * * * * * * * * 
  # * the model parameters

  # n is total number of individuals
  # m is total number of groups, indexed by i
  # p is number of fixed effects
  # q is number of random effects

  q <- nrow(VarCorr(mod0)$group)                  # number of random effects
  n <- nrow(dat)                                  # number of individuals
  m <- length(unique(dat$group))                  # number of goups
  Y <- dat$y                                      # response vector

  X <- cbind(rep(1,n), dat$x)                     # model matrix of fixed effects (n x p)
  beta <- as.numeric(fixef(mod0))                 # fixed effects vector (p x 1)

  Z.sparse <- t(mod0@Zt)                          # model matrix of random effect (sparse format)
  Z <- as.matrix(Z.sparse)                        # model matrix Z (n x q*m)
  b <- as.matrix(ranef(mod0)$group)               # random effects vector (q*m x 1)

  D <- diag(VarCorr(mod0)$group[1:q,1:q], q*m)    # covariance matrix of random effects
  R <- diag(1,nrow(dat))*summary(mod0)@sigma^2    # covariance matrix of residuals
  V <- Z %*% D %*% t(Z) + R                       # (total) covariance matrix of Y

  # check: values in Y can be perfectly matched using lmer's information
  Y.test <- X %*% beta + Z %*% b + resid(mod0)
  cbind(Y, Y.test)

  # * * * * * * * * * * * * * * * * * * * * * * * * 
  # * the likelihood function

  # profile and restricted log-likelihood (Harville, 1997)
  loglik.p <- - (0.5) * (  (log(det(V))) + t((Y - X %*% beta)) %*% solve(V) %*% (Y - X %*% beta)  )
  loglik.r <- loglik.p - (0.5) * log(det( t(X) %*% solve(V) %*% X ))

  #check: value of above function does not match the generic (restricted) log-likelihood of the mer-class object
  loglik.lmer <- logLik(mod0, REML=TRUE)
  cbind(loglik.p, loglik.r, loglik.lmer)

Maybe there are some LMM-experts here who can help? In any case your recommendations are greatly appreciated!

edit: BTW, the likelihood function for LMMs can be found in Harville (1977), (hopefully) accessible through this link: Maximum likelihood approaches to variance component estimation and to related problems

Regards, Simon

SimonG
  • 4,701
  • 3
  • 20
  • 31
  • 2
    I **strongly** recommend that you get the development version of `lme4` (probably from github, via `devtools`), which has the capability (`mkDevfunOnly=TRUE`) of returning a deviance function – Ben Bolker Feb 13 '13 at 15:46
  • Thanks! I looked into the github-version of `lme4` and installed it using `devtools`. Is there some further documentation on the `devFunOnly=T` argument and the function it produces? I am particularly interested in the arguments I have to feed to the resulting deviance function, because this again is the most important step for me! – SimonG Feb 14 '13 at 12:47
  • the deviance function returned when \code{devFunOnly} is \code{TRUE} takes a single numeric vector argument, representing the \code{theta} vector. This vector defines the variance-covariance function of the random effects, in the Cholesky parameterization. For a single random effect, this is a single value equal to the standard deviation of the random effect ... – Ben Bolker Feb 14 '13 at 13:51
  • ... For more complex or multiple random effect, running \code{getME(.,"theta")} to retrieve the \code{theta} vector for a fitted model and examining the names of the vector is probably the easiest way to determine the correspondence between the elements of the \code{theta} vector and elements of the lower triangles of the Cholesky factors of the random effects. (I just added this to the documentation. Does it make sense, or can you suggest improvements?) – Ben Bolker Feb 14 '13 at 13:52
  • I forgot to say that theta defines the *scaled* variance-covariance matrix (i.e. relative to the residual variance). – Ben Bolker Feb 14 '13 at 15:58

1 Answers1

2

The solution (as of Mar 2013) was to install the development version of lme4 and make use of the devFunOnly argument.

That development version, along with this capability, is available with lme4 on CRAN since 14-Mar-2014, and the reference guide gives explanations that compliment the package author's (Ben Bolker) comments to the original question.

Thell
  • 5,883
  • 31
  • 55