
I have a large panel dataset with ~2000 individuals and ~15000 observations (person/year). I have a set of time-varying and non-time-varying variables and a binary outcome variable (0/1). I am trying to do a multilevel discrete-time survival analysis with glmer from the "lme4" package.

id = Individual ID, survtime = # of years of survival before event/censoring

I couldn't produce a reproducible example with such a large dataset, but here is my code:

# var1, var2: discrete, time-varying; var3: dummy, non-time-varying;
# var4: four-level categorical, non-time-varying
Modelsurv <- glmer(outcome ~ survtime + var1 + var2 + var3 + var4 + (1 | id),
                   family = binomial(link = "cloglog"),
                   data = dataset,
                   control = glmerControl(optimizer = "bobyqa",
                                          optCtrl = list(maxfun = 2e5)))

I am trying to replicate this example here; see point 8 (multilevel discrete-time survival analysis). I don't understand what the code control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e5)) is doing, and in my case, with such large data, how and to what should I set this?

I tried using the above code but get the following error message.

Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  unable to evaluate scaled gradient
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge: degenerate Hessian with 1 negative eigenvalues

Can anyone help me understand and guide me on this? And will I have to adjust the iteration settings based on the number and kind of variables I add to the model?

Thank you!

rais

1 Answer


As far as iterations go ... the linked document says:

you can normally ignore the control argument. It is used here to increase the maximum iteration.
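
To make that concrete, here is a minimal sketch of what those two pieces do (the values are the ones from the question, not recommendations):

# glmerControl() bundles settings for the optimization step of glmer().
# optimizer = "bobyqa" selects the BOBYQA optimizer for the fit;
# maxfun = 2e5 raises the cap on the number of objective-function
# evaluations the optimizer may use before it gives up.
ctrl <- glmerControl(optimizer = "bobyqa",
                     optCtrl = list(maxfun = 2e5))

Raising maxfun does not change the model itself; it only gives the optimizer more room to reach convergence before stopping.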

As far as the warning (not an error!) goes: the warning is telling you that the fit is numerically suspect (which does not mean it's wrong, just that you should check some things!). From ?convergence:

use ‘allFit’ to try the fit with all available optimizers (e.g. several different implementations of BOBYQA and Nelder-Mead, L-BFGS-B from ‘optim’, ‘nlminb’, ...). While this will of course be slow for large fits, we consider it the gold standard; if all optimizers converge to values that are practically equivalent, then we would consider the convergence warnings to be false positives.
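
For concreteness, a minimal sketch of that check, assuming the fitted model object is called Modelsurv as in the question:

library(lme4)

# Refit the same model with every available optimizer; this will be slow
# on a dataset of this size.
all_fits <- allFit(Modelsurv)

# Compare results across optimizers: if the fixed-effect estimates and
# log-likelihoods are practically equivalent, the convergence warning is
# likely a false positive.
ss <- summary(all_fits)
ss$fixef   # fixed-effect estimates from each optimizer
ss$llik    # log-likelihoods from each optimizer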

Ben Bolker
  • Thank you Ben! Could you explain to me what "maxfun = 2e5" is doing and whether there is a way to fix that iteration setting? It is arbitrary for now, and if I change it, the model's effect sizes and significance change. So I was not sure if I am doing it right vs. just ignoring it. – rais Mar 15 '22 at 14:47
  • You probably should have stated that in your question. Are you getting a warning ("convergence code 1 from bobyqa: bobyqa -- maximum number of function evaluations exceeded")? – Ben Bolker Mar 15 '22 at 14:52
  • Sorry about the confusion. Here is the warning (```Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model failed to converge with max|grad| = 4.09063 (tol = 0.002, component 1) Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio - Rescale variables?```) – rais Mar 15 '22 at 15:06
  • I'm surprised that the following two things are simultaneously true: (1) changing `maxfun` changes your results and (2) you're not getting any warnings about "maximum number of function evaluations exceeded" – Ben Bolker Mar 15 '22 at 15:20
  • My sincere apologies, Ben. I realized that it is not because of the "maxfun", but because of how I entered my variables! As for the warning, any suggestion, or is it fine to just ignore it? – rais Mar 15 '22 at 15:32
  • See my answer, and the `?lme4::convergence` help page. – Ben Bolker Mar 15 '22 at 15:37