I am trying to use this data set, https://data.cityofnewyork.us/Transportation/Citywide-Mobility-Survey-Person-Survey-2019/6bqn-qdwq, to fit an MNL (multinomial logit) model, but every time I try to convert my original data frame like this:

nydata_df = dfidx(nydata, shape="wide",choice="work_mode",varying = sort)

I get this error:

Error in names(data)[ix] : invalid subscript type 'language'

I'm unclear about what is causing this error. I think it is something wrong with dplyr, but I am not sure.

1 Answer

According to this vignette from the mlogit package, the varying argument specifies which variables should be "lengthened" when dfidx converts a data frame from wide to long. Are you actively trying to lengthen your data frame (in the style of tidyr::pivot_longer())?

If you aren't, I don't believe you need the varying argument (see ?stats::reshape for more info on varying). If you do want to use it, you should name the specific variables to lengthen rather than passing only "sort". Additionally, when I run your models, I don't get NaN for McFadden's R2, the p-value, or the chi-square test. Are your packages fully updated?

library(dfidx)
library(mlogit)
library(performance) # to extract McFadden's R2 easily

packageVersion("dfidx")
#> [1] '0.0.5'
packageVersion("mlogit")
#> [1] '1.1.1'
packageVersion("dplyr")
#> [1] '1.0.10'
# currently running RStudio Version 2022.7.2.576

nydata <- read.csv(url("https://data.cityofnewyork.us/api/views/6bqn-qdwq/rows.csv?accessType=DOWNLOAD"))
nydata_df <- dfidx(data = nydata, 
                   shape = "wide",
                   choice = "work_mode")

m <- mlogit(work_mode ~ 1, nydata_df)
#summary(m)
r2_mcfadden(m)
#> McFadden's R2 
#>  1.110223e-16
m3 <- mlogit(work_mode ~ 1 | harassment_mode + age, nydata_df)
#summary(m3)
r2_mcfadden(m3)
#> McFadden's R2 
#>    0.03410362
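
To illustrate what the varying argument is meant to receive, here is a minimal sketch using base R's stats::reshape, which the answer points to for the meaning of varying. The toy data frame, its column names (time_car, time_bus), and its values are all made up for illustration and are not from the survey data; the sketch also assumes dfidx interprets varying the same way reshape does.

```r
# Toy wide data: one row per person, one travel-time column per alternative.
# Every name and value here is hypothetical, not taken from the NYC survey.
wide <- data.frame(
  id       = 1:3,
  choice   = c("car", "bus", "car"),
  time_car = c(10, 25, 15),
  time_bus = c(30, 20, 40)
)

# `varying` names the columns to stack. With sep = "_", reshape() splits
# "time_car" into the variable name "time" and the alternative "car".
long <- reshape(
  wide,
  direction = "long",
  varying   = c("time_car", "time_bus"),
  sep       = "_",
  timevar   = "alt",
  idvar     = "id"
)

# Sort so that each person's alternatives sit next to each other.
long <- long[order(long$id, long$alt), ]
print(long)
```

The point is that varying identifies actual columns of the data frame (by name or position); a bare function such as sort does not identify any column.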
jrcalabrese
  • No, because when I run an mlogit model, McFadden's R^2, the chi-squared test, and the p-value are all NaN, which indicates an error. This happens for a variety of models I have tried. Basically, I can run a model, but something gets messed up in the data, so the model is meaningless. – Thomas E.M. Schlesinger Nov 21 '22 at 06:19
  • Can you post some of your model code? – jrcalabrese Nov 21 '22 at 12:37
  • Sure. m3<-mlogit(work_mode ~ 1 | harassment_mode +age,nydata_df) is a simple example of one I tried, but it doesn't really matter: even with m <-mlogit(work_mode ~ 1 ,nydata_df) you still get NaN for McFadden's R^2, the chi-squared test, and the p-value, and the same is true for more complex models. The problem is not the model; the problem is the data frame. – Thomas E.M. Schlesinger Nov 21 '22 at 23:33
  • I've updated the answer; I replicated your models, but I didn't get `NaN` for p, chi-square, or R2. – jrcalabrese Nov 22 '22 at 16:22