I am implementing a multinomial logit model using the mlogit package in R. The data includes three different "choices" and three variables (A, B, C) which contains information for the independent variable. I have transformed the data into a wide format using the mlogit.data function which makes it look like this:
Observation Choice VariableA VariableB VariableC
1 1 1.27 0.2 0.81
1 0 1.27 0.2 0.81
1 -1 1.27 0.2 0.81
2 1 0.20 0.45 0.70
2 0 0.20 0.45 0.70
2 -1 0.20 0.45 0.70
The thing is that I want the independent variable to be choice-specific and therefore being constructed as Variable D below:
Observation Choice VariableA VariableB VariableC VariableD
1 1 1.27 0.2 0.81 1.27
1 0 1.27 0.2 0.81 0.2
1 -1 1.27 0.2 0.81 0.81
2 1 0.20 0.45 0.70 0.20
2 0 0.20 0.45 0.70 0.45
2 -1 0.20 0.45 0.70 0.70
Variable D was constructed using the following code:
choice_map <- data.frame(choice = c(1, 0, -1), var = grep('Variable[A-C]', names(df)))
df$VariableD <- df[cbind(seq_len(nrow(df)), with(choice_map, var[match(df$Choice, choice)]))]
However, when I try to run the multinomial logit model,
mlog <- mlogit(Choice ~ 1 | VariableD, data=df, reflevel = "0")
the error message "row names supplied are of the wrong length" is returned. When I use any of the other variables A-C separately the regression is run without any problems, so my questions are therefore: why can't Variable D be used and how can this problem be solved?
Thanks!