
I am trying to estimate a nested logit model of company siting choices, with nests = countries and alternatives = provinces, based on a number of alternative-specific characteristics as well as some company-specific characteristics. I formatted my data to a "long" structure using:

data <- mlogit.data(DB, choice="Occurrence", shape="long", chid.var="IDP", varying=6:ncol(DB), alt.var="Prov")

Here's a sample of the data:

             IDP Occurrence      From Prov ToC      Dist     Price     Yield
     5p1.APY 5p1      FALSE Sao Paulo  APY  PY 0.0000000 0.3698913 0.0000000
     5p1.BOQ 5p1      FALSE Sao Paulo  BOQ  PY 0.6495493 0.3698913 0.0000000
     5p1.CHA 5p1      FALSE Sao Paulo  CHA  AR 0.7870593 0.4622464 0.4461496
     5p1.COR 5p1      FALSE Sao Paulo  COR  AR 0.3747480 0.4622464 0.5536546
     5p1.FOR 5p1      FALSE Sao Paulo  FOR  AR 0.6822188 0.4622464 0.4402772
     5p1.JUY 5p1      FALSE Sao Paulo  JUY  AR 1.0000000 0.4622464 0.3617038

Note that I've reduced the table to a few variables for clarity but would normally use more.

The code I use for the nested logit is the following:

nests <- list(Bolivia="SCZ",Paraguay=c("PHY","BOQ","APY"),Argentina=c("CHA","COR","FOR","JUY","SAL","SFE","SDE"))

nml <- mlogit(Occurrence ~ DistComp + PriceComp + YieldComp, data=data, nests=nests, unscaled=T)
summary(nml)

When running this model, I get the following output:

> summary(nml)

Call:
mlogit(formula = Occurrence ~ DistComp + PriceComp + YieldComp, 
    data = data, nests = nests, unscaled = T)

Frequencies of alternatives:
      APY       BOQ       CHA       COR       FOR       JUY       PHY       SAL       SCZ       SDE       SFE 
0.1000000 0.0666667 0.1333333 0.0250000 0.0750000 0.0083333 0.0083333 0.1166667 0.2583333 0.1750000 0.0333333 

bfgs method
1 iterations, 0h:0m:0s 
g'(-H)^-1g = 1E+10 
last step couldn't find higher value 

Coefficients :
                Estimate Std. Error t-value Pr(>|t|)
BOQ:(intercept) -0.29923         NA      NA       NA
CHA:(intercept) -1.25406         NA      NA       NA
COR:(intercept) -1.76020         NA      NA       NA
FOR:(intercept) -1.97083         NA      NA       NA
JUY:(intercept) -4.14476         NA      NA       NA
PHY:(intercept) -2.63961         NA      NA       NA
SAL:(intercept) -1.72047         NA      NA       NA
SCZ:(intercept) -0.15714         NA      NA       NA
SDE:(intercept) -0.57449         NA      NA       NA
SFE:(intercept) -2.47345         NA      NA       NA
DistComp         2.44322         NA      NA       NA
PriceComp        2.45202         NA      NA       NA
YieldComp        3.15611         NA      NA       NA
iv.Bolivia       1.00000         NA      NA       NA
iv.Paraguay      1.00000         NA      NA       NA
iv.Argentina     1.00000         NA      NA       NA

Log-Likelihood: -221.84
McFadden R^2:  0.10453 
Likelihood ratio test : chisq = 51.79 (p.value = 2.0552e-09)

I don't understand what causes the NAs in the output, considering that I prepared the data using mlogit.data(). Any help on this would be greatly appreciated.

Best,

Yann

  • Your issue with NAs in the output is not related to needing to run the model with unscaled=TRUE. That is required because you have one "degenerate" nest with only a single alternative -- only a single province in Bolivia. See the example on page 35 of the vignette provided in package mlogit. Have you tried running the model with fewer covariates to see if it will work with a simpler model? – atiretoo Oct 01 '15 at 19:56
  • Thanks for your comment, Shami. You're right, I was confused about the `unscaled=T` option, so I edited my question and removed that part. I have tried the model with as few as one variable and it won't run, so it doesn't seem to be about the number of covariates. Imposing a unique elasticity (`un.nest.el=T`) or removing intercepts (`-1`) doesn't work either. Thanks – Yann le Polain Oct 01 '15 at 23:18
  • Sorry I meant thanks @atiretoo, Shami edited some of the contents of the post but wasn't responsible for the comment, my mistake :) – Yann le Polain Oct 02 '15 at 14:59
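
Alongside trying a simpler model, it may be worth ruling out data-layout problems before the nested fit. The following is only a sketch in base R, not a diagnosis of the NAs. It uses a hypothetical toy data frame `toy` and nest list `toy_nests` that mimic the column names in the question (`IDP`, `Prov`, `Occurrence`), and the two checks are assumptions about what could go wrong, not something the output above confirms.

```r
# Hypothetical toy data in the same long layout as the question's `DB`:
# two choice situations (IDP) with three alternatives (Prov) each.
toy <- data.frame(
  IDP        = rep(c("5p1", "5p2"), each = 3),
  Prov       = rep(c("APY", "BOQ", "SCZ"), times = 2),
  Occurrence = c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical nest definition covering the toy alternatives.
toy_nests <- list(Bolivia = "SCZ", Paraguay = c("BOQ", "APY"))

# Check 1: every choice situation has exactly one chosen alternative.
chosen_per_chid <- tapply(toy$Occurrence, toy$IDP, sum)
bad_chids <- names(chosen_per_chid)[chosen_per_chid != 1]

# Check 2: the alternatives named in the nests and those present
# in the data are the same set.
missing_from_data  <- setdiff(unlist(toy_nests), unique(toy$Prov))
missing_from_nests <- setdiff(unique(toy$Prov), unlist(toy_nests))

print(bad_chids)
print(missing_from_data)
print(missing_from_nests)
```

On the real data, the same checks would be run with `DB` and `nests` in place of `toy` and `toy_nests`. Any non-empty result points to a layout issue worth fixing before looking at the optimizer.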
