
I am studying the impact of various characteristics on court decisions for specific offences. The dataset is fairly large (28,928 observations in 86 level-2 units). The outcome is the decision whether or not to incarcerate someone (a binary variable), and I use level-1 and level-2 characteristics as controls (level-1 variables are in capitals).

This is my code:

library(lme4)

MLmodel196a_2 <- glmer(NEPO_ANO_NE ~ 
                     # level-1 (case) predictors, in capitals
                     OZNACENY_RECIDIVISTA_REG + POCET_DRIV_ODSOUZENI_REG +
                     ROK_ODSOUZENI_REG + OMEZENI_A_POVINNOST_REG +
                     POCET_HLAVNICH_LICENI + DRUH_ZAHAJENI_RIZENI_REG + 
                     NOVELA_REG + ODSTAVEC_REG +
                     EU_OBCANSTVI + POHLAVI_REG + VEK_SPACHANI_REG +
                     # level-2 (court district) predictors, continuous and unscaled
                     objasnenost_procenta + kriminalita_relativni_REG +
                     venkov_mesto + socialni + nezamestani_celkem + 
                     vzdelani_zakladni_procenta +
                     prumerny_vek + podil_15az24_muzu_procenta +
                     zenati_vsichni_procenta + 
                     verici_procenta + volby_ucast +
                     # random intercept for each court
                     (1 | Nazev_soudu), family = binomial, data = vyber196) 

When I run this, I receive this error:

Error: (maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate

If I run the same analysis on a different dataset (a different offence), it produces results with several warnings. If I fit this dataset with only the level-1 control variables, it again produces results with several warnings.

Most of the level-1 variables are categorical; the level-2 variables are all continuous and unscaled.

Unfortunately, I cannot share any data, since the government provided it on the condition that it not be passed on.

I do not understand why this happens only for this offence and not for the other offences. Is there a way around it?

(lme4 version 1.1-12, R version 3.3.1)

Jakub Drapal
  • Potentially, it could be caused by quasi-separation: fitted probabilities of exactly 0 or 1 occurring during estimation. This can be caused, for example, by categorical variables with too many levels. You can try fitting the model without the categorical variables and, if that works, work out which variable is causing the problem and either drop it or relevel it. – Love-R Jun 24 '16 at 15:28
  • Thank you. However, I am not sure whether this is the reason. When I fit only the level-1 characteristics, it works. If I fit a different dataset with very similar level-1 characteristics and the same level-2 characteristics, it goes through (with warning messages). – Jakub Drapal Jun 24 '16 at 15:58
  • What are the warning messages? The fact that quasi-separation did not occur on a different dataset with the same variables does not imply that it does not occur on this dataset. – Love-R Jun 24 '16 at 16:02
  • Another potential reason for quasi-separation could be large values of numeric variables. Look for outliers, and cap, floor, or scale your numeric variables. – Love-R Jun 24 '16 at 16:09
  • Now I am getting a result, thank you very much. There was one variable that I did not expect to play such a role at all, my bad. As for the warnings, I would post them as a separate question, which I cannot do now since I am not allowed to... – Jakub Drapal Jun 24 '16 at 16:52
  • I am glad that it worked out, barte. – Love-R Jun 24 '16 at 18:26
  • If you have found a solution to your own question, you are encouraged to post it as an answer. – Ben Bolker Jun 25 '16 at 00:45
  • Yes, sorry: after removing one of the continuous variables, it worked. The variable was the number of hearings in a case, which was zero in the majority of cases. Since it is not possible to incarcerate someone without a hearing, the outcome was quasi-separated on this variable, which probably broke the fitting process. The majority of the warnings were finally solved by scaling and by restarting the fit from the original value (nos. 1 and 4 in the examples in ?convergence - thanks for it!). – Jakub Drapal Jun 25 '16 at 10:54
  • @JakubDrapal , please post your comment as an answer ... (trying to go through and clear up unanswered `lme4` questions ...) – Ben Bolker Jul 12 '16 at 23:08
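
The check suggested in the comments above can be sketched in base R. This is only an illustration on simulated data: the names `hearings` and `incarcerated` and the data-generating process are assumptions, not the original variables, which cannot be shared.

```r
# Simulated stand-in data; all names and numbers here are hypothetical.
set.seed(1)
n <- 2000
hearings <- rbinom(n, size = 3, prob = 0.15)   # mostly zero hearings per case
# Assumption: no incarceration without a hearing, otherwise a coin flip.
incarcerated <- ifelse(hearings == 0, 0L, rbinom(n, 1, 0.5))
d <- data.frame(incarcerated, hearings)

# Cross-tabulate the outcome against the suspect predictor. A row with an
# empty cell means that predictor value occurs with only one outcome,
# which is the signature of quasi-complete separation.
tab <- table(hearings = d$hearings, incarcerated = d$incarcerated)
print(tab)

# Predictor values for which only one outcome is ever observed.
separated_levels <- rownames(tab)[rowSums(tab > 0) == 1]
print(separated_levels)
```

Running the same cross-tabulation for each candidate predictor (binning continuous ones first) is one way to narrow down which variable to drop, cap, or relevel.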

1 Answer


After removing one of the continuous variables, it worked. The variable was the number of hearings in a case, which was zero in the majority of cases. Since it is not possible to incarcerate someone without a hearing, the outcome was quasi-separated on this variable, which probably broke the fitting process. The majority of the warnings were finally solved by scaling and by restarting the fit from the original value (nos. 1 and 4 in the examples in ?convergence - thanks for it!).
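
A minimal sketch of the two remedies for the warnings, following the pattern of the `?convergence` examples in lme4. The data are simulated and every variable name (`age`, `turnout`, `court`) is hypothetical; this is not the model from the question.

```r
library(lme4)

# Simulated two-level data; names and effect sizes are assumptions.
set.seed(2)
n_courts <- 30
n <- 3000
court_id <- sample(n_courts, n, replace = TRUE)
turnout_by_court <- runif(n_courts, 30, 80)     # level-2 predictor, raw percent
court_effect <- rnorm(n_courts, 0, 0.5)         # random intercept per court
age <- rnorm(n, 35, 10)                         # level-1 predictor
turnout <- turnout_by_court[court_id]
eta <- -1 + 0.03 * (age - 35) + 0.02 * (turnout - 55) + court_effect[court_id]
d <- data.frame(y = rbinom(n, 1, plogis(eta)),
                age, turnout, court = factor(court_id))

# Remedy 1: centre and scale the continuous predictors before fitting.
d$age_s <- as.numeric(scale(d$age))
d$turnout_s <- as.numeric(scale(d$turnout))

fit <- glmer(y ~ age_s + turnout_s + (1 | court),
             family = binomial, data = d)

# Remedy 4: restart the optimizer from the estimates of the previous fit.
pars <- getME(fit, c("theta", "fixef"))
fit_restart <- update(fit, start = pars)
summary(fit_restart)
```

If the restarted fit reports essentially the same estimates without warnings, the original warnings were likely false positives rather than a sign of a misfitted model.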

Jakub Drapal