The following is a minimal working example that reproduces the error. This code worked in previous versions of mlogit but no longer works in version 1.1-0 (released May 26, 2020).

library(mlogit)
data("ModeCanada", package = "mlogit")
bususers <- with(ModeCanada, case[choice == 1 & alt == "bus"]) 
ModeCanada <- subset(ModeCanada, ! case %in% bususers) 
ModeCanada <- subset(ModeCanada, noalt == 4) 
ModeCanada <- subset(ModeCanada, alt != "bus") 
ModeCanada$alt <- ModeCanada$alt[drop = TRUE] 
KoppWen00 <- mlogit.data(ModeCanada, shape='long', 
    chid.var = 'case',alt.var = 'alt', choice = 'choice',            
    drop.index = TRUE, varying = 5:8) 

Executing the last line of code above generates the following error: Error in dfidx::dfidx(data = data, dfa$idx, drop.index = dfa$drop.index for data in wide format, providing id2 is irrelevant. This error message is confusing in two ways. First, the code seems to be misinterpreting my data: the data is in long format (it is example "long" data that comes with the mlogit package and is used in several of its examples) and I pass the function the argument shape = 'long', but the error says the data is being interpreted as "wide" and that this is causing a problem. Second, the error message tells me that id2 is irrelevant, but I don't know what id2 is; a search of the mlogit vignettes and package description for id2 yielded no results.

Notarobot2244

2 Answers

The issue seems to come from the way dfidx handles (or receives) the data. By default, mlogit.data (which is a wrapper for dfidx in the most recent version of the mlogit package) is able to find the "varying" columns when the data is in long format. For example, with data on transportation choice, if individual i is choosing over transportation options j = 1, ..., J, dfidx can tell that columns 5 through 8 are properties of the different transportation choices j (e.g. the cost of taking a car, train or bus) and not properties of the individual i (cf. income in column 9 of ModeCanada, which is not "varying"). Thus, it seems that when the "varying" argument is passed to mlogit.data, it forces the function to interpret the data as though it were in "wide" format, even when the user also passes shape = 'long'.

The solution is just to remove the argument "varying" from the function mlogit.data in this case, since mlogit.data can now determine the varying columns for itself when you pass it data in long format. That is, the following code will accomplish your goal.

library(mlogit)
data("ModeCanada", package = "mlogit")
bususers <- with(ModeCanada, case[choice == 1 & alt == "bus"]) 
ModeCanada <- subset(ModeCanada, ! case %in% bususers) 
ModeCanada <- subset(ModeCanada, noalt == 4) 
ModeCanada <- subset(ModeCanada, alt != "bus") 
ModeCanada$alt <- ModeCanada$alt[drop = TRUE] 
KoppWen00 <- mlogit.data(ModeCanada, shape='long', 
    chid.var = 'case',alt.var = 'alt', choice = 'choice',            
    drop.index = TRUE) 

Notarobot2244

From the mlogit.data help page: "mlogit.data is deprecated, use dfidx::dfidx() instead". So I think mlogit.data calls dfidx. The varying argument causes the error there, because if this argument is set, dfidx assumes the data is in wide format.
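
Since the help page points to dfidx::dfidx(), here is a minimal sketch of what calling it directly might look like for this data. It assumes that dfidx accepts the two index columns as a character vector through its idx argument (choice situation first, alternative second) and that the choice column can be left in the data as is; neither detail comes from the question, so check ?dfidx for your installed version.

```r
library(mlogit)
library(dfidx)  # loaded explicitly; mlogit 1.1-0 relies on it for indexing

data("ModeCanada", package = "mlogit")

# Same filtering as in the question: drop cases that chose the bus,
# keep cases with four alternatives, then remove the bus rows
bususers <- with(ModeCanada, case[choice == 1 & alt == "bus"])
ModeCanada <- subset(ModeCanada, !case %in% bususers)
ModeCanada <- subset(ModeCanada, noalt == 4)
ModeCanada <- subset(ModeCanada, alt != "bus")
ModeCanada$alt <- ModeCanada$alt[drop = TRUE]

# Long-format call (assumption: idx takes c(<choice situation>, <alternative>)).
# Note that `varying` is omitted, since it is only meaningful for wide data.
KoppWen00 <- dfidx(ModeCanada, shape = "long",
                   idx = c("case", "alt"), drop.index = TRUE)
```

The resulting object can then be passed to mlogit() in place of the one built with mlogit.data.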

Ahorn