I want to estimate a nested logit model in R. I am using "mlogit", the standard package for this kind of problem. I would now like to estimate a model with more than one nesting stage. The problem is as follows:

  1. Stage: People decide whether or not to migrate to the US.
  2. Stage: Those who decide to migrate choose which region of the US to move to (the US is divided into 6 regions).
  3. Stage: Within the region, they decide what kind of area to live in: urban vs. rural.
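To make the tree concrete, here is a minimal base R sketch of the alternatives these three stages imply. The labels (`region_1`, `region_1.rural`, and so on) are hypothetical and not taken from the data. It only writes the tree down as a flat named list with one nest per region plus a singleton home nest; it does not estimate anything and does not represent the top-level migrate/stay split as a separate nesting level.

```r
# Hypothetical labels for the six US regions and the two area types.
regions <- paste0("region_", 1:6)
areas   <- c("rural", "urban")

# One elemental alternative per region/area pair, plus staying at home.
foreign_alts <- as.vector(outer(regions, areas, paste, sep = "."))
alts <- c("home_country", foreign_alts)

# Flat nest list: a singleton nest for home and one nest per region.
region_nests <- lapply(regions, function(r) paste(r, areas, sep = "."))
names(region_nests) <- regions
nests <- c(list(home = "home_country"), region_nests)

length(alts)   # number of elemental alternatives
names(nests)   # nest labels
```

Every elemental alternative appears in exactly one nest, which is the shape a named list of character vectors can express.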

I already transformed my data using mlogit.data().

data <- mlogit.data(data = usa_canada_uk, choice = "migrant")

This is how "data" looks:

                sex   marst numperhh_cat age_cat   famsize                               nchlt5 labour_code sample                               nchlt10
3888.no  female married          1-5     16+         1          no chiled aged 5 or younger not working   8262          no chiled aged 10 or younger
12874.no female married          1-5     16+ 2 or more at least one child aged 5 or younger   ancillary   8262 at least one child aged 10 or younger
13084.no female married          1-5     16+ 2 or more at least one child aged 5 or younger   ancillary   8262 at least one child aged 10 or younger
9359.yes female married          1-5     16+         1          no chiled aged 5 or younger     service   8262          no chiled aged 10 or younger
7569.no  female married          1-5     16+         1          no chiled aged 5 or younger     service   8262          no chiled aged 10 or younger
5778.no  female married          1-5     16+         1          no chiled aged 5 or younger not working   8262          no chiled aged 10 or younger
         perwt        labforce age migrant country_of_birth       region citypop urban work.prob.home work.prob.abroad migration.prob stay.prob  chid
3888.no      1     in labforce  26    TRUE   United Kingdom home_country      NA urban             NA               NA             NA        NA  3888
12874.no     1     in labforce  47    TRUE   United Kingdom home_country      NA rural             NA               NA             NA        NA 12874
13084.no     1     in labforce  22    TRUE   United Kingdom home_country      NA urban             NA               NA             NA        NA 13084
9359.yes     1     in labforce  28   FALSE   United Kingdom home_country      NA urban             NA               NA             NA        NA  9359
7569.no      1     in labforce  32    TRUE   United Kingdom home_country      NA urban             NA               NA             NA        NA  7569
5778.no      1 not in labforce  38    TRUE   United Kingdom home_country      NA rural             NA               NA             NA        NA  5778
         alt
3888.no   no
12874.no  no
13084.no  no
9359.yes yes
7569.no   no
5778.no   no
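The printout above has a binary choice (`migrant`), while the three-stage tree needs one label per final alternative. One possible way to get such a label, sketched on made-up toy rows, is to combine the `region` and `urban` columns. The region values other than `home_country` are hypothetical here, and the column name `choice` is my own.

```r
# Toy rows reusing the column names from the data; the values are invented.
toy <- data.frame(
  region = c("home_country", "region_1", "region_3", "home_country"),
  urban  = c("urban", "rural", "urban", "rural"),
  stringsAsFactors = FALSE
)

# Collapse region and area type into one label. People who stay at home
# keep the single label "home_country", whatever the urban column says.
toy$choice <- ifelse(toy$region == "home_country",
                     "home_country",
                     paste(toy$region, toy$urban, sep = "."))
toy$choice
```

Whether the home alternative should ignore the urban/rural split is a modelling decision; this sketch simply follows the tree described in the three stages.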

Here is my idea for how to code it, but it does not work:

mlog <- mlogit(migrant ~ 1  | age + numperhh_cat + sex + famsize + work.prob.home,
             nests = list(home = c("home_country"),
                         foreign = c(region_1 = c("rural", "urban"),
                                     region_2 = c("rural", "urban"),
                                     region_3 = c("rural", "urban"),
                                     region_4 = c("rural", "urban"),
                                     region_5 = c("rural", "urban"),
                                     region_6 = c("rural", "urban"))
                                     ),
           reflevel = "yes",
           weights = perwt,
           data = data)

As you can see, one nest at the first stage (deciding not to migrate = "home_country") is degenerate, i.e. it contains only a single alternative.
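Why a degenerate nest matters can be seen from the textbook two-level nested logit formulas. The notation below ($V_j$ for the systematic utility, $B_k$ for nest $k$, $\lambda_k$ for its nest parameter) is my own and follows one common normalization; the parameterization used inside mlogit may differ.

```latex
% Inclusive value of nest k and the probability of choosing that nest
\[
I_k = \ln \sum_{j \in B_k} \exp\!\left(\frac{V_j}{\lambda_k}\right),
\qquad
P(B_k) = \frac{\exp(\lambda_k I_k)}{\sum_{l} \exp(\lambda_l I_l)}
\]

% Singleton nest B_home = {home}: the nest parameter cancels
\[
\lambda_{\text{home}} I_{\text{home}}
  = \lambda_{\text{home}} \cdot \frac{V_{\text{home}}}{\lambda_{\text{home}}}
  = V_{\text{home}}
\]
```

Under this normalization the parameter of the home nest drops out of the likelihood, so it cannot be estimated separately and would have to be fixed or constrained.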

If someone could help me, that would be awesome.

Best wishes,

Chris

ResearchR
  • Can you provide the full `usa_canada_uk` data set, perhaps by using `dput()`, so that your problem is fully reproducible? – davechilders Jul 05 '15 at 14:56
  • The problem is that the data in its original shape has more than 4 million rows, and this is just a small extract. The complete file is 25 times bigger (it sits on a server). I will give it a try.... – ResearchR Jul 05 '15 at 15:06
  • I just created the file with dput...I think it is way too big to post here. Are there any other options for making my problem reproducible? – ResearchR Jul 05 '15 at 15:15
  • Using `saveRDS()`, you could save your data as an rds file and host it on a webpage. – davechilders Jul 05 '15 at 15:28
  • If you're going to put it on Dropbox, I think it would be easier to just save the data file as an rds. – davechilders Jul 05 '15 at 17:00
  • Done: https://dl.dropboxusercontent.com/u/66405775/usa_canda_uk.rds – ResearchR Jul 05 '15 at 17:23
  • Hello Chris, were you able to solve this problem? I am also trying to build a multistage nested logit model using the mlogit package. I am new to this package and have just started exploring it. Can anyone explain how the model would be defined using the mlogit function, and how the data needs to be formatted for the model? – Terminator17 Apr 02 '19 at 13:20
  • @Terminator17: Sorry, I never managed to implement a multilevel nested mlogit model. – ResearchR Apr 05 '19 at 12:49
  • Thank you so much for the reply @chris-a. I will let you know if I manage to implement it or make any progress, so that others can benefit. – Terminator17 Apr 05 '19 at 14:48

0 Answers