I'm using lme4 to build a collaborative filter and am running into convergence issues. While trying to resolve them with the resources listed below, I now get a new error:
Error in ans.ret[meth, ] <- c(ans$par, ans$value, ans$fevals, ans$gevals, :
number of items to replace is not a multiple of replacement length
This happened after the model had been running towards convergence for ~48 hours.
- https://rstudio-pubs-static.s3.amazonaws.com/33653_57fc7b8e5d484c909b615d8633c01d51.html
- https://stats.stackexchange.com/questions/242109/model-failed-to-converge-warning-in-lmer
Note: with optimx, nlminb seems to work best, followed by L-BFGS-B.
I have a model that's structured as follows:
library(lme4)
library(optimx)
library(stringi)
library(data.table)

set.seed(1423L)
# highly imbalanced outcome variable (about 2% positives)
y <- sample.int(2L, size= 910000, replace= TRUE, prob= c(0.98, 0.02)) - 1L
# product biases
prod <- sample(letters, size= 910000, replace= TRUE)
# user biases: 35,000 users with 26 rows each
my_grps <- stringi::stri_rand_strings(n= 35000, length= 10)
grps <- rep(my_grps, each= 26)
# binary and categorical covariates
x1 <- sample.int(2L, size= 910000, replace= TRUE, prob= c(0.9, 0.1)) - 1L
x2 <- sample.int(2L, size= 910000, replace= TRUE, prob= c(0.9, 0.1)) - 1L
x3 <- sample.int(2L, size= 910000, replace= TRUE, prob= c(0.9, 0.1)) - 1L
x4 <- sample(LETTERS[1:5], size= 910000, replace= TRUE)

dt <- data.table(y= y,
                 prod= prod, grps= grps,
                 x1= x1, x2= x2, x3= x3, x4= x4)

# product-specific intercepts (no global intercept) plus a random intercept per user
lmer1 <- glmer(y ~ -1 + prod + x1 + x2 + x3 + x4 + (1|grps),
               data= dt, family= binomial(link= "logit"),
               control= glmerControl(optimizer= 'optimx', optCtrl= list(method= 'nlminb')))
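For anyone who wants to compare the two optimx methods mentioned in the note above without waiting days, here is a scaled-down sketch. The 200-group data frame `small`, the reduced formula, and the helper `fit_with()` are illustrative assumptions, not my real setup, and I have not confirmed that this triggers the error.

```r
library(lme4)
library(optimx)

set.seed(1423L)
n_grps <- 200L            # hypothetical: far fewer users than the real 35,000
n <- n_grps * 26L         # 26 rows (one per product) for each user

small <- data.frame(
  y    = rbinom(n, size = 1L, prob = 0.1),          # less imbalanced than the real data
  prod = factor(rep(letters, times = n_grps)),
  x1   = rbinom(n, size = 1L, prob = 0.1),
  grps = factor(rep(seq_len(n_grps), each = 26L))
)

# Hypothetical helper: same model structure, only the optimx method changes.
fit_with <- function(method) {
  glmer(y ~ -1 + prod + x1 + (1 | grps),
        data = small, family = binomial(link = "logit"),
        control = glmerControl(optimizer = "optimx",
                               optCtrl = list(method = method)))
}

fit_nlminb <- fit_with("nlminb")
fit_lbfgsb <- fit_with("L-BFGS-B")

# If both optimizers reach the same optimum, these differences should be small.
max(abs(fixef(fit_nlminb) - fixef(fit_lbfgsb)))
c(nlminb = as.numeric(logLik(fit_nlminb)),
  lbfgsb = as.numeric(logLik(fit_lbfgsb)))
```

Because the simulated data has no real user effect, a singular-fit message on this toy data would not be surprising.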
I haven't verified that the data above reproduces the error, but that's the model setup. I don't understand the error message at all. Any help would be appreciated.
Reported as lmer issue #425
NOTE: in my true use case, I have closer to 15.5M observations and 30-50 products, where each product has a different average response rate (y).
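To make the simulated data look more like that, the outcome could be drawn with a product-specific rate instead of a single 2% rate. This is only a sketch: the rates in `prod_rate` are made-up values drawn uniformly between 0.5% and 5%, not my real response rates.

```r
set.seed(1423L)
n <- 910000L

prod <- sample(letters, size = n, replace = TRUE)

# Hypothetical per-product response rates (assumed values, not real data)
prod_rate <- setNames(runif(length(letters), min = 0.005, max = 0.05), letters)

# Each row's success probability is looked up from its product
y <- rbinom(n, size = 1L, prob = prod_rate[prod])

# Observed response rate per product, for comparison against prod_rate
obs_rate <- tapply(y, prod, mean)
round(cbind(assumed = prod_rate, observed = obs_rate[names(prod_rate)]), 4)
```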
I have also switched from a kNN approach (the typical collaborative filter) to HLM because R is terribly optimized for kNN at scale. I should probably use something like annoy, which I have yet to try out.