
Predicting values in new data from an lmer model throws an error when a period is used to represent predictors. Is there any way around this?

The answer to this similar question offers a way to automatically write out the full formula instead of using the period, but I'm curious if there's a way to get predictions from new data just using the period.

Here's a reproducible example:

library(lme4)

mydata <- data.frame(
    groups = rep(1:3, each = 100),
    x = rnorm(300),
    dv = rnorm(300)
)

train_subset <- sample(1:300, 300 * .8)
train <- mydata[train_subset,]
test <- mydata[-train_subset,]

# Fitting with the period works, but predicting on new data returns an error
mod <- lmer(dv ~ . - groups + (1 | groups), data = train)
predict(mod, newdata = test) # error
predict(mod) # getting predictions for the original data works

# Writing the full formula without the period does not return an error, even though it's the exact same model
mod <- lmer(dv ~ x + (1 | groups), data = train)
predict(mod, newdata = test)
joshwondra
  • What does the error message say? – Gregor Thomas Jan 03 '22 at 17:32
  • Error in terms.formula(formula(x, fixed.only = TRUE)): '.' in formula and no 'data' argument Traceback: 1. predict(mod, newdata = test) 2. predict.merMod(mod, newdata = test) 3. get.orig.levs(object, fixed.only = TRUE) 4. terms(object, ...) 5. terms.merMod(object, ...) 6. terms.formula(formula(x, fixed.only = TRUE)) – joshwondra Jan 03 '22 at 17:35
  • this looks like it might be an lme4 bug ... ?? – Ben Bolker Jan 03 '22 at 17:50

1 Answer


This should be fixed in the development branch of lme4 now. You can install from GitHub (see first line below) or wait a few weeks (early April-ish) for a new version to hit CRAN.

remotes::install_github("lme4/lme4") ## you will need compilers etc.
library(lme4)

mydata <- data.frame(
    groups = rep(1:3, each = 100),
    x = rnorm(300),
    dv = rnorm(300)
)

train_subset <- sample(1:300, 300 * .8)
train <- mydata[train_subset,]
test <- mydata[-train_subset,]

# No longer returns an error with the development version
mod <- lmer(dv ~ . - groups + (1 | groups), data = train)
p1 <- predict(mod, newdata = test)

mod2 <- lmer(dv ~ x + (1 | groups), data = train)
p2 <- predict(mod2, newdata = test)
identical(p1, p2) ## TRUE
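
If installing the development version is not an option, one possible workaround is to expand the period into explicit term labels before fitting, so that the stored formula never contains a dot. This is a minimal sketch rather than a tested recipe for every model: the names fixed_terms, full_formula, mod3 and p3 are made up for illustration, and set.seed() is added only so the example is reproducible.

```r
library(lme4)

set.seed(101)  # assumption: seed added only for reproducibility of the sketch
mydata <- data.frame(
    groups = rep(1:3, each = 100),
    x = rnorm(300),
    dv = rnorm(300)
)
train_subset <- sample(1:300, 300 * .8)
train <- mydata[train_subset, ]
test <- mydata[-train_subset, ]

# Expand the period against the training data and keep only the
# fixed-effect term labels (here just "x", since groups is subtracted)
fixed_terms <- attr(terms(dv ~ . - groups, data = train), "term.labels")

# Rebuild a dot-free formula, adding the random intercept back by hand
full_formula <- reformulate(c(fixed_terms, "(1 | groups)"), response = "dv")

mod3 <- lmer(full_formula, data = train)
p3 <- predict(mod3, newdata = test)
```

Because full_formula spells out every term, predict() does not need to re-expand the period against a data argument, which is the step that fails in the traceback above.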
Ben Bolker