I have the following dataframe X:

chid.var    id.var  alt.var wei odd cou cla pla
1           1       1       130 6.6 0   5   1
1           1       2       119 224 0   5   0
1           1       3       126 10  0   5   0
1           1       4       128 66  0   5   0
1           1       5       131 8.3 0   5   0
2           1       1       130 15  0   4   1
2           1       2       124 6.9 0   4   0
2           1       3       119 156 0   4   0
2           1       4       122 39  0   4   0
2           1       5       124 2   0   4   0
3           1       1       121 5.3 0   1   1
3           1       2       112 22  0   1   0
3           1       3       119 7.2 0   1   0
3           1       4       119 5.9 0   1   0
3           1       5       133 1.7 0   1   0
3           1       6       113 72  0   1   0
3           1       7       113 12  0   1   0
4           1       1       121 13  0   4   1
4           1       2       125 6   0   4   0
4           1       3       127 6.3 0   4   0

Here there is only one decision maker (individual), indicated by 1 in id.var, a varying choice set with alternatives 1 to 7 in alt.var, and the choice situation in chid.var. wei and odd are alternative-specific variables, cou and cla are choice-situation-specific variables, and pla is the dependent variable (the choice).

I tried to use the mlogit package to model the choice using the probit model:

df <- mlogit.data(data=X,
                  choice = "pla",
                  shape = "long",
                  chid.var = "chid.var",
                  id.var = "id.var",
                  alt.var = "alt.var")

model <- mlogit(pla ~ wei + odd | cou + cla, data = df, probit = TRUE)

However, I get a "subscript out of bounds" error: Error in As[[pos[i, j]]] : subscript out of bounds

I then reduced the data frame to the first two choice situations only, so that the choice set is the same (alternatives 1 to 5) in both:

chid.var    id.var  alt.var wei odd cou cla pla
1           1       1       130 6.6 0   5   1
1           1       2       119 224 0   5   0
1           1       3       126 10  0   5   0
1           1       4       128 66  0   5   0
1           1       5       131 8.3 0   5   0
2           1       1       130 15  0   4   1
2           1       2       124 6.9 0   4   0
2           1       3       119 156 0   4   0
2           1       4       122 39  0   4   0
2           1       5       124 2   0   4   0

and ran the same code again:

X <- X[-c(11:20),]
df <- mlogit.data(data=X,
                  choice = "pla",
                  shape = "long",
                  chid.var = "chid.var",
                  id.var = "id.var",
                  alt.var = "alt.var")

model <- mlogit(pla ~ wei + odd | cou + cla, data = df, probit = TRUE)

and this time I get the "system is computationally singular" error: Error in solve.default(H, g[!fixed]) : system is computationally singular: reciprocal condition number = 9.15665e-23

I have looked at several questions on Stack Overflow, but none of them seemed relevant. Please help, and thanks in advance.

Ishigami
  • Is this all your data? The problem is that your data is singular, i.e. the determinant is equal to zero. Check the things you can do in this post: https://stackoverflow.com/questions/58080637/mlogit-error-error-in-solve-defaulth-gfixed-lapack-routine-dgesv-syste – Quinten Jun 05 '22 at 17:09
  • @Quinten No, this is not all my data. But even with just this subset the model reports that the system is computationally singular, although the rows above are all distinct, so I would not expect the matrix to be singular. – Ishigami Jun 06 '22 at 08:41
  • Is it possible to share your complete data using `dput`? – Quinten Jun 06 '22 at 08:42
  • @Quinten Here is the complete data: https://drive.google.com/file/d/1uA6O8Fp2N4WHZZN9pOHKzRefXJ2tkMHC/view?usp=sharing and my code is `df <- mlogit.data(data=X, choice="Choice", shape="long", chid.var = "chid.var", id.var = "id.var", alt.var = "alt.var")` followed by `model <- mlogit(Choice ~ Weight + Draw + Age | Course + Class + Distance | 0, data = df, probit = TRUE)` – Ishigami Jun 06 '22 at 11:39

1 Answer

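Try removing the intercept. I ran into this issue with panel data before and that fixed it. In an mlogit formula the alternative-specific intercepts belong to the second part, so that is where the 0 goes:

model <- mlogit(pla ~ wei + odd | 0 + cou + cla, data = df, probit = TRUE)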

Another possible cause is that one variable is on a much larger scale than the others. Try log-transforming the variables, or dividing the larger variable by a power of ten, and then adjust your coefficient interpretations accordingly, e.g.:

X$weiTRANSFORM <- X$wei / 10

Yet another possible cause is strong collinearity between the variables, in which case your feature selection may need to be refined. Try lasso regression or stepwise selection to see which variables are most suitable. Hope this helps.
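As a rough way to check the collinearity point on the rows posted in the question, here is a minimal base R sketch. The data frame is retyped from the question's reduced sample (first two choice situations); the names covariates, constant_cols, design_rank and n_columns are made up for illustration. It only inspects a plain linear design matrix and does not reproduce what mlogit builds internally, so treat it as a first screen rather than a diagnosis.

```r
# Sample rows copied from the question (first two choice situations only).
X <- data.frame(
  chid.var = rep(1:2, each = 5),
  id.var   = 1,
  alt.var  = rep(1:5, times = 2),
  wei = c(130, 119, 126, 128, 131, 130, 124, 119, 122, 124),
  odd = c(6.6, 224, 10, 66, 8.3, 15, 6.9, 156, 39, 2),
  cou = 0,
  cla = rep(c(5, 4), each = 5),
  pla = rep(c(1, 0, 0, 0, 0), times = 2)
)

covariates <- c("wei", "odd", "cou", "cla")

# A column with a single distinct value cannot be separated from an intercept.
is_constant <- sapply(X[covariates], function(v) length(unique(v)) == 1)
constant_cols <- covariates[is_constant]

# Rank of the plain design matrix (intercept + covariates) versus its column count.
mm <- model.matrix(~ wei + odd + cou + cla, data = X)
design_rank <- qr(mm)$rank
n_columns <- ncol(mm)

print(constant_cols)
print(c(rank = design_rank, columns = n_columns))
```

On these rows cou is flagged because it is 0 everywhere, and the rank comes out below the number of columns. Dropping any flagged column before calling mlogit would be the first thing to try; whether that is enough for the full data is not something this check can tell.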

robbieNukes