
I have a large motor insurance dataset and want to fit a mixed-model regression with glmmTMB to model the expected claim frequency, with the purpose of determining an initial base premium.

My script looks like this:

glmmTMB(response ~ Var1 + Var2 + Var3 + ... +
    offset(log(exposure_level)) + (1|policy_id),
    data = data,
    family = nbinom1(link = "log"))

No matter what I do, I get warnings about NaN values and convergence, and the p-values, standard errors, z values, AIC, BIC, logLik and deviance in the summary are all NaN.

I get the following warnings:

Warning messages:
1: In .checkRankX(TMBStruc, control$rank_check) : fixed effects in conditional model are rank deficient
2: In (function (start, objective, gradient = NULL, hessian = NULL, : NA/NaN function evaluation
3: In (function (start, objective, gradient = NULL, hessian = NULL, : NA/NaN function evaluation
4: In fitTMB(TMBStruc) : Model convergence problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')
5: In fitTMB(TMBStruc) : Model convergence problem; false convergence (8). See vignette('troubleshooting')

I have tried grouping the data more coarsely and leaving out variables, but nothing fixes the issues. No matter what I try, the warnings and NaN values still show up.

Has anyone experienced the same problem and knows how to solve it?

Ben Bolker
Kat
  • What warnings do you get, specifically? Can you edit your question to include them, and the output of `summary()`? Have you tried running `diagnose()` on your fitted model? – Ben Bolker May 17 '23 at 12:44

1 Answer


The key is the first warning:

In .checkRankX(TMBStruc, control$rank_check) : fixed effects in conditional model are rank deficient

If you add the argument control = glmmTMBControl(rank_check = "adjust") to your model fit, it should take care of this automatically.

This warning means that you have multicollinear columns in your set of predictor variables. There are lots of reasons this happens: interactions with some combinations missing, dummy variables that completely describe a set of categorical options, constant columns ...

For example:

library(glmmTMB)
data("sleepstudy", package = "lme4")
ss <- transform(sleepstudy, Days2 = Days)
m <- glmmTMB(Reaction ~ Days + Days2 + (1|Subject),
   data = ss,
   control = glmmTMBControl(rank_check = "adjust"))

which gives the message (not a warning any more)

dropping columns from rank-deficient conditional model: Days2
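
To see which columns are responsible before fitting, the design matrix can be inspected directly. The following is a minimal base-R sketch, not glmmTMB's own rank check: the data frame `d` and its columns (`x1`, `x2`, `g`, `const`) are made up for illustration, and the approach simply relies on the pivoted QR decomposition from `qr()`, which moves linearly dependent columns to the end.

```r
# Hypothetical toy data: x2 duplicates x1, and `const` is a constant column
# (which is collinear with the intercept).
set.seed(1)
d <- data.frame(x1 = rnorm(20), g = factor(rep(c("a", "b"), 10)))
d$x2 <- d$x1
d$const <- 1

# Build the fixed-effects design matrix (no response is needed for this).
X <- model.matrix(~ x1 + x2 + g + const, data = d)

# Pivoted QR decomposition: the rank is the number of independent columns,
# and the pivot entries beyond the rank index the dependent ones.
qrX <- qr(X)
rank_X <- qrX$rank
aliased <- colnames(X)[qrX$pivot[seq_len(ncol(X)) > rank_X]]

cat("columns:", ncol(X), " rank:", rank_X, "\n")
cat("linearly dependent columns:", paste(aliased, collapse = ", "), "\n")
```

If `rank_X` is smaller than `ncol(X)`, the names in `aliased` are candidates for removal or recoding. Which column of a dependent set gets flagged depends on column order, so it is worth looking at the whole set rather than only the flagged names.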

Ben Bolker
  • Thank you for this advice; it fixed the first warning. Unfortunately, the last four warnings still come up and the summary still contains NA and NaN. – Kat May 18 '23 at 06:23