6

I want to make a "Fixed/Random Models for Generalized linear model" (family="binomial"), because I have a data base where observations come from a population and there is a grouping structure. Then I use the function glmer from the lme4 package, also I have read that I can use the glmmPQL function from library MASS (Faraway,2006).

My problem arises when I want to justify the use of random versus fixed model using the Hausman's test (Greene,2012), I don't find a specific function that allows me to do this similar to the phtest test featured in the package plm.

How to justify using the random model?

Cœur
  • 37,241
  • 25
  • 195
  • 267
Tappin73
  • 137
  • 1
  • 7
  • 1
    can you give a little more information? I found the [Wikipedia page](http://en.wikipedia.org/wiki/Hausman_test) but I'm not sure what b_0 and b_1 represent: are they just the fixed-effect covariates (i.e. not including the grouping variable)? – Ben Bolker May 13 '14 at 14:56
  • thaks for your help! I'm worried because I don't find a specific function within R, especially in the lme4 package to perform this test, so I doubt if it could be done. Or if instead I can justify otherwise the random model best fits the data (anova). There a question similar: http://stats.stackexchange.com/questions/58900/fixed-vs-random-effects – Tappin73 May 13 '14 at 16:56
  • I'm not sure with the explanation of wikipedia, so I show you the Greene definition. "The test is bases that under the hypotesis of no correlation, both OLS, LSDV and FGLS estimators are consistent, but OLS is inefficient, whereas under the alternative, LSDV is consistent, but FGLS is not. Therefore, under the null hypothesis, the two estimates should not differ systematically, and a tes cam be based on the difference"(Greene, 2012). – Tappin73 May 13 '14 at 16:57
  • Example fron Greene: "we retrieved the coefficient vector and estimated asymptotic covariance matrix, b_fe and V_fe from the fixed effects results and the first nine elements of B_re an V_re (excluding the constant term)" [I don't know why nine first elements.] H=(b_fe - B_re)' [V_fe - V_re]^-1 (b_fe - B_re) – Tappin73 May 13 '14 at 16:58
  • the anova won't work particularly well because it is tricky to justify the requirement that one model is nested within the other (as is necessary for the likelihood ratio test to apply). – Ben Bolker May 13 '14 at 19:13

1 Answers1

11

This is a straightforward tweaking of the plm::phtest function. I've commented on the only lines of code that I actually changed. USE AT YOUR OWN RISK and if at all possible test it against the results from plm::phtest. I have just adapted the code, not thought about whether it's really doing the right thing!

phtest_glmer <- function (glmerMod, glmMod, ...)  {  ## changed function call
    coef.wi <- coef(glmMod)
    coef.re <- fixef(glmerMod)  ## changed coef() to fixef() for glmer
    vcov.wi <- vcov(glmMod)
    vcov.re <- vcov(glmerMod)
    names.wi <- names(coef.wi)
    names.re <- names(coef.re)
    coef.h <- names.re[names.re %in% names.wi]
    dbeta <- coef.wi[coef.h] - coef.re[coef.h]
    df <- length(dbeta)
    dvcov <- vcov.re[coef.h, coef.h] - vcov.wi[coef.h, coef.h]
    stat <- abs(t(dbeta) %*% as.matrix(solve(dvcov)) %*% dbeta)  ## added as.matrix()
    pval <- pchisq(stat, df = df, lower.tail = FALSE)
    names(stat) <- "chisq"
    parameter <- df
    names(parameter) <- "df"
    alternative <- "one model is inconsistent"
    res <- list(statistic = stat, p.value = pval, parameter = parameter, 
        method = "Hausman Test",  alternative = alternative,
                data.name=deparse(getCall(glmerMod)$data))  ## changed
    class(res) <- "htest"
    return(res)
}

Example:

library(lme4)
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                   data = cbpp, family = binomial)
gm0 <- glm(cbind(incidence, size - incidence) ~ period +  herd,
                   data = cbpp, family = binomial)

phtest_glmer(gm1,gm0)
##  Hausman Test
## data:  cbpp
## chisq = 10.2747, df = 4, p-value = 0.03605
## alternative hypothesis: one model is inconsistent

This seems to work for lme models too:

library("nlme")
fm1 <- lme(distance ~ age, data = Orthodont) # random is ~ age
fm0 <- lm(distance ~ age*Subject, data = Orthodont)
phtest_glmer(fm1,fm0)

## Hausman Test 
## data:  Orthodont
## chisq = 0, df = 2, p-value = 1
## alternative hypothesis: one model is inconsistent
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453