Why am I getting this warning when running my GLM? Note that Pcurv also enters the model as a squared term.

“glm.fit: fitted probabilities numerically 0 or 1 occurred”

> summary(df.1)
 lon              lat           Roughness           Pcurv         
 Min.   :-71.00   Min.   :-29.98   Min.   :-0.7575   Min.   :-6.62627  
 1st Qu.:-68.70   1st Qu.:-23.90   1st Qu.:-0.7048   1st Qu.:-0.08573  
 Median :-67.34   Median :-19.13   Median :-0.4133   Median : 0.28108  
 Mean   :-66.62   Mean   :-20.71   Mean   : 0.0000   Mean   : 0.00000  
 3rd Qu.:-65.11   3rd Qu.:-17.45   3rd Qu.: 0.4076   3rd Qu.: 0.32911  
 Max.   :-60.15   Max.   :-14.07   Max.   : 4.7961   Max.   : 6.09728  
 Mean.MIN.Temp     Mean.MAX.Temp     Precipitation          pres    
 Min.   :-1.7400   Min.   :-1.9045   Min.   :-0.8101   Min.   :0.0  
 1st Qu.:-0.7141   1st Qu.:-0.7716   1st Qu.:-0.6943   1st Qu.:0.0  
 Median :-0.4810   Median :-0.3799   Median :-0.4338   Median :0.5  
 Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000   Mean   :0.5  
 3rd Qu.: 1.0841   3rd Qu.: 1.0467   3rd Qu.: 0.5794   3rd Qu.:1.0  
 Max.   : 1.7223   Max.   : 1.8859   Max.   : 4.3427   Max.   :1.0  
      grp   
 Min.   :1  
 1st Qu.:2  
 Median :3  
 Mean   :3  
 3rd Qu.:4  
 Max.   :5  

My model code:

mdl.glm <- glm(pres~Roughness+Pcurv*I(Pcurv^2)+Mean.MIN.Temp+Mean.MAX.Temp, family=binomial(link=logit), data=subset(df.1,grp!=1))
Common_Codin

1 Answer

The warning means that while R is computing probabilities internally, as part of the fitting process, some of them "underflow" or "overflow": they are so close to 0 or 1 that they can't be distinguished from those values in R's standard 64-bit floating-point precision (e.g. values less than about 1e-308, or greater than about 1 - 1e-16).
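As a rough illustration of those limits, the sketch below uses only base R (`plogis()` is the inverse-logit function from the stats package). The linear-predictor values 40, -700 and -800 are arbitrary choices made for this example, not values taken from the question's data.

```r
# Floating-point limits that bound what a fitted probability can represent
.Machine$double.eps    # spacing of doubles just above 1 (about 2.2e-16)
.Machine$double.xmin   # smallest normalized positive double (about 2.2e-308)

# plogis(eta) = 1 / (1 + exp(-eta)) converts a linear predictor to a probability
p_high <- plogis(40)     # true value is 1 - 4e-18, which rounds to exactly 1
p_low  <- plogis(-800)   # exp(800) overflows to Inf, so the result is exactly 0
p_tiny <- plogis(-700)   # extremely small, but still representable

p_high == 1   # TRUE
p_low == 0    # TRUE
p_tiny > 0    # TRUE
```

A linear predictor does not have to be very extreme for this to happen on the upper side: a value of 40 on the logit scale is already indistinguishable from a probability of 1.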

There's not much you can do about this; it usually has to do with the structure of your data. That said, you may be able to improve the numerical stability of the fit and avoid the warning with the following two general strategies:

  • Center and scale all of your continuous variables (e.g. using the scale() function). This will change the numerical values of your coefficients, but not their p-values (except for the intercept), and won't affect the overall fit of the model (R^2, predictions, etc.) at all.
  • Use an orthogonal polynomial poly(Pcurv,2) rather than the "raw" quadratic Pcurv + I(Pcurv^2). This will change these two parameters and their p-values etc., but again won't affect the overall fit of the model.
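A minimal sketch of both strategies follows. The data frame `dd` is simulated here as a stand-in for `df.1` (which isn't available), and the coefficients used to generate it, as well as the name `Roughness_sc`, are arbitrary assumptions made for illustration.

```r
# Simulated stand-in for the question's data (assumption: not the real df.1)
set.seed(101)
n <- 200
dd <- data.frame(Roughness = rnorm(n), Pcurv = rnorm(n, mean = 3, sd = 2))
eta <- -1 + 0.5 * dd$Roughness + 0.3 * dd$Pcurv - 0.1 * dd$Pcurv^2
dd$pres <- rbinom(n, size = 1, prob = plogis(eta))

# "Raw" quadratic on the unscaled predictors
m_raw <- glm(pres ~ Roughness + Pcurv + I(Pcurv^2),
             family = binomial(link = "logit"), data = dd)

# Strategy 1: center and scale (scale() returns a matrix, so drop to a vector)
dd$Roughness_sc <- as.numeric(scale(dd$Roughness))

# Strategy 2: orthogonal polynomial instead of the raw quadratic
m_orth <- glm(pres ~ Roughness_sc + poly(Pcurv, 2),
              family = binomial(link = "logit"), data = dd)

# The coefficients differ, but the overall fit should agree up to
# numerical tolerance
same_fitted   <- all.equal(fitted(m_raw), fitted(m_orth), tolerance = 1e-5)
same_deviance <- all.equal(deviance(m_raw), deviance(m_orth), tolerance = 1e-5)
```

Comparing `coef(m_raw)` with `coef(m_orth)` shows the reparameterization, while `same_fitted` and `same_deviance` check that the two models describe the same fit.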

If you can't make the warning go away, I would check the following issues carefully:

  • Do you have symptoms of complete separation, i.e. large (absolute) parameter values (e.g. |beta|>8) and possibly ridiculously large standard errors/p-values? Suggestions for what to do about it include replacing Wald tests with likelihood ratio tests, Firth regression via brglm2, and Bayesian models with regularizing priors.
  • Do your fits generally make sense, i.e. do the diagnostics look OK (e.g. see the DHARMa package) and are the predicted values sensible? (Of course this is something you should always check, but do it extra-carefully.)
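The separation check above can be sketched as follows. The data are a deliberately separated toy example (every `x` below zero has `pres = 0`, every `x` above zero has `pres = 1`), not the question's data, and the object names are hypothetical. The brglm2 step runs only if that package is installed; passing `brglm2::brglmFit` as the `method` argument of `glm()` is my understanding of its documented usage, so treat that part as an assumption to verify.

```r
# Toy data with complete separation (assumption: illustration only)
x <- c(seq(-3, -0.5, length.out = 20), seq(0.5, 3, length.out = 20))
dd_sep <- data.frame(x = x, pres = as.numeric(x > 0))

# glm() emits the "fitted probabilities numerically 0 or 1" warning here;
# it is suppressed only to keep the example output quiet
m_sep <- suppressWarnings(glm(pres ~ x, family = binomial, data = dd_sep))

# Symptoms: a very large |beta| and an enormous standard error
coef(summary(m_sep))
beta_x <- unname(coef(m_sep)["x"])
wald_p <- coef(summary(m_sep))["x", "Pr(>|z|)"]

# A likelihood ratio test is not misled by the inflated standard error
lrt_p <- suppressWarnings(anova(m_sep, test = "Chisq"))[2, "Pr(>Chi)"]

# Optional: bias-reduced (Firth-type) fit, only if brglm2 is available
if (requireNamespace("brglm2", quietly = TRUE)) {
  m_br <- glm(pres ~ x, family = binomial, data = dd_sep,
              method = brglm2::brglmFit)
  print(coef(summary(m_br)))
}
```

In this toy case the Wald p-value for `x` is close to 1 even though `x` predicts the response perfectly, while the likelihood ratio test gives a very small p-value; that mismatch is the kind of symptom worth looking for.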

After that, ignore the warnings and move on with your analysis.

Ben Bolker