Predict() with nested multinomial logit models

Question

I'm using the mlogit package in R to create a nested multinomial logit model of healthcare provider choice given choice data I have. The data look like this:

          ID   RES    weight age  wealth    educ married urban partnerAge totalChildren survivingChildren anyANC
1.0  2468158 FALSE 0.2609153  29    Poor Primary       1     0         31             4                 4      1
1.1  2468158  TRUE 0.2609153  29    Poor Primary       1     0         31             4                 4      1
1.2  2468158 FALSE 0.2609153  29    Poor Primary       1     0         31             4                 4      1
1.3  2468158 FALSE 0.2609153  29    Poor Primary       1     0         31             4                 4      1
2.0 14233860 FALSE 0.2754970  19 Poorest Primary       1     0         30             1                 1      1
2.1 14233860  TRUE 0.2754970  19 Poorest Primary       1     0         30             1                 1      1
2.2 14233860 FALSE 0.2754970  19 Poorest Primary       1     0         30             1                 1      1
2.3 14233860 FALSE 0.2754970  19 Poorest Primary       1     0         30             1                 1      1
    outlier50Km optout alt               spa     mes   dist      bobs  cobs    Q fees chid educSec
1.0           0     -1   0              Home    Home  0.000 0.0000000 0.000 0.00    0    1       0
1.1           0     -1   1     Health center  Public 13.167 0.4898990    NA 0.64    0    1       0
1.2           0     -1   2     Health center  Public 30.596 0.5202020    NA 0.56    0    1       0
1.3           0     -1   3 District hospital  Public 41.164 0.7171717 0.825 0.88    0    1       0
2.0           0     -1   0              Home    Home  0.000 0.0000000 0.000 0.00    0    2       0
2.1           0     -1   1     Health center Mission 14.756 0.7676768    NA 0.64    1    2       0
2.2           0     -1   2     Health center  Public 41.817 0.3787879    NA 0.56    0    2       0
2.3           0     -1   3 District hospital  Public 50.419 0.7171717 0.825 0.88    0    2       0

where spa, mes, dist, bobs, cobs, Q, and fees are characteristics of the provider and the remaining variables specific to the individual. These data are in long format, meaning each individual has four rows, reflecting her four choices (alt = 0:3), with RES being the response variable.

An un-nested model behaves appropriately

f.full <- RES ~ 0 + dist + Q + bobs + fees + spa | 0 + age + wealth + educSec + married + urban + totalChildren + survivingChildren
choice.ml.full <- mlogit(formula = f.full, data = data, weights = weight)
predict(choice.ml.full, data[1:8,])

             0         1          2          3
[1,] 0.1124429 0.7739403 0.06893341 0.04468343
[2,] 0.4465272 0.3107375 0.11490317 0.12783210

By all measures of model fit, however, a nested model is better than an un-nested one. The nested model gives me coefficients appropriately:

ns2 <- mlogit(formula = f.full, nests = list(home = "0", useCare = c("1", "2", "3")), data = data, weight = weight, un.nest.el = TRUE)
summary(ns2)
Call:
mlogit(formula = f.full, data = data, weights = weight, nests = list(home = "0", 
    useCare = c("1", "2", "3")), un.nest.el = TRUE)

Frequencies of alternatives:
       0        1        2        3 
0.094378 0.614216 0.194327 0.097079 

bfgs method
23 iterations, 0h:0m:13s 
g'(-H)^-1g = 9.51E-07 
gradient close to zero 

Coefficients :
                        Estimate Std. Error t-value  Pr(>|t|)    
dist                  -0.0336233  0.0040136 -8.3773 < 2.2e-16 ***
Q                      0.1780058  0.0768181  2.3172 0.0204907 *  
bobs                  -0.0695695  0.0505795 -1.3754 0.1689925    
fees                  -0.8488132  0.1001928 -8.4718 < 2.2e-16 ***
etc...

But, I get the following error if I try to predict on a single individual:

predict(ns2, data[1:4,])
Error in apply(Gl, 1, sum) : dim(X) must have a positive length

and a different error if I try to predict on more than one individual:

predict(ns2, data[1:8,])
Error in solve.default(crossprod(attr(x, "gradi")[, !fixed])) : 
  Lapack routine dgesv: system is exactly singular: U[5,5] = 0

Any help would be vastly appreciated.

You need to include all the variables into the `newdata` argument of `predict` as used by the `f.full`. — Roman Luštrik, Jun 15 '17 at 14:13
@RomanLuštrik, all the data are there. `data` is simply a subset of my original dataset. — TheChainsOfMarkov, Jun 15 '17 at 18:40

Predict() with nested multinomial logit models

0 Answers0