
I am currently working on a behavior modelling project that involves estimating a multinomial logit model. After searching online, I came across the mnlogit package, which seems well suited to this task.

The problem I am trying to model can be described as follows: a customer is offered 5 products and either picks exactly 1 of them or decides not to pick any. The products differ by price and delivery time. The prices and delivery times are fixed across all customers. So, a customer can pick from 6 alternatives: 1, 2, 3, 4, 5 and 0. Alternatives 1 to 5 represent products 1 to 5, while alternative 0 represents the option of not picking any product. Products 1 and 2 cost $1, products 3 and 4 cost $2, and product 5 also costs $1. Alternative 0, on the other hand, costs $0.

Raw Data

To simulate the customers' decisions, I generated 7 parameters myself. I defined 'Price' as an alternative-independent variable, meaning that price carries the same weight in every alternative's utility. In addition, I defined 'Alternative' as an alternative-specific variable, which yields another 6 parameters. My goal was to simulate the attractiveness of a product due to its delivery time, since each alternative has a fixed delivery time. I calculated the utility of each product using the following expression:

product_utility = (B_alternative[ alternativeNum ] * alternativeNum) + (B_price * productPrice)

Here B_alternative is the vector of my alternative parameters: [0, 0.6, 0.5, 0.45, 0.3, 0.3], with each index of this vector corresponding to one alternative number (B_alternative[0] is the parameter for alternative 0), and B_price is my price parameter: -0.5.

So, the utilities I calculated are: 0.00 ; 0.10 ; 0.50 ; 0.35 ; 0.20 ; 1.00 , where the first number is the utility of alternative 0 and the last is the utility of product 5.

After calculating these utilities, I calculated the probability of a customer choosing alternative n with the following expression:
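As a quick check, the utilities above can be reproduced in R. This is only a sketch: the variable names are illustrative, and the price vector is my reading of the prices listed earlier (alternative 0 first).

```r
# Parameters as stated in the question; element 1 of each vector is alternative 0.
alternative_num <- 0:5
b_alternative   <- c(0, 0.6, 0.5, 0.45, 0.3, 0.3)
b_price         <- -0.5
product_price   <- c(0, 1, 1, 2, 2, 1)  # assumed from the prices described above

# product_utility = B_alternative[alt] * alt + B_price * price
utility <- b_alternative * alternative_num + b_price * product_price
names(utility) <- alternative_num
print(round(utility, 2))
```

Note that in this expression the alternative parameter is multiplied by the alternative number, so it does not act as a plain alternative-specific constant.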

Pn = exp(Un) / sum(exp(U))

Here 'sum(exp(U))' is the sum of the exponentiated utilities over all 6 alternatives.

The calculated probabilities (which add up to 1) were: 0.1097376 ; 0.1212788 ; 0.1809268 ; 0.1557251 ; 0.1340338 ; 0.2982978 , for alternatives 0 to 5 respectively.

Using these probabilities and a random function, I generated a 'Mode' column in my table, representing each customer's choice:
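For reference, a minimal R sketch of that calculation, starting from the utilities listed above (variable names are illustrative):

```r
# Utilities for alternatives 0..5, copied from the values above.
utility <- c(0.00, 0.10, 0.50, 0.35, 0.20, 1.00)
names(utility) <- 0:5

# Multinomial logit choice probabilities: Pn = exp(Un) / sum(exp(U))
prob <- exp(utility) / sum(exp(utility))
print(round(prob, 7))
```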

Data with choice column
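The choice column can be drawn in several ways. One possible sketch, assuming base R's sample() as the random function and a hypothetical number of customers and seed, is:

```r
set.seed(42)          # hypothetical seed, only for reproducibility
alternatives <- 0:5
prob <- c(0.1097376, 0.1212788, 0.1809268, 0.1557251, 0.1340338, 0.2982978)
n_customers <- 1000   # hypothetical sample size

# Draw one chosen alternative per customer according to the probabilities.
chosen <- sample(alternatives, size = n_customers, replace = TRUE, prob = prob)

# Empirical choice shares, to compare against the theoretical probabilities.
shares <- table(factor(chosen, levels = alternatives)) / n_customers
print(round(shares, 3))
```

With enough customers, the empirical shares should be close to the probabilities listed above.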

Finally, following the documentation I found on CRAN, I wrote this code to estimate the model:

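# Read the simulated data (semicolon-separated)
artificialData <- read.csv(PathToData, sep = ";")
# Model formula: PRICE with a generic coefficient (no intercept) | intercept only | ALT with alternative-specific coefficients
fm <- formula(MODE ~ PRICE - 1 | 1 | ALT)
# Convert the long-format table (one row per customer and alternative) into mlogit data
TestData <- mlogit::mlogit.data(artificialData,
                                choice = "MODE", shape = "long",
                                alt.levels = c(1,2,3,4,5,0),
                                id.var = "CUSTOMER_ID")
# Estimate the multinomial logit model
fit <- mnlogit::mnlogit(fm, TestData)
print(summary(fit))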

However, no matter what parameters I set, I always get one of these two messages:

Error in solve.default(hessian, gradient, tol = 1e-24) : Lapack routine dgesv: system is exactly singular: U[7,7] = 0

or

In sqrt(diag(vcov(object))) : NaNs produced
