I was trying to obtain the expected utility for each individual using the clogit function from R's survival package, and I was not able to find a simple solution such as mlogit's logsum.
Below I set up an example of how one would do it using the mlogit package. It is pretty straightforward: it just requires fitting the model with the mlogit function, saving the output, and passing it as an argument to the logsum function (if needed, there is a short explanation in this vignette). What I want to know is the equivalent method for clogit. I've read the package's manual, but I have failed to grasp which function would be the most adequate to perform the analysis.
Note1: My preference for a function like mlogit's logsum is related to the fact that I might need to run a large number of regressions later on, and being able to perform the correct estimation in different scenarios would be helpful.
Note2: The dataset created below is not meant to be representative of how real data should behave. I set up the example solely for the purpose of running the function after the logit regressions.
**
library(survival)
library(mlogit)
#creating a dataset: 20 individuals, 4 alternatives each (long format)
df_test=data.frame(id=rep(1:20,each=4),
choice=rep(c("train","car","plane","boat"),times=20),
distance=c(rnorm(80)*10)
)
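#draws x rounded normal values and redraws until exactly one of them equals 1,
#so that each individual ends up with a single chosen alternative
f=function(x,y,z) {
v=round(rnorm(x,y,z))
while(sum(v)>1 || sum(v)==0) {
v=round(rnorm(x,y,z))
}
return(v)
}
#stacking the 0/1 choice indicators of the 20 individuals
result1=c()
for (i in 1:20) {
result=f(4,0.5,0.1)
result1=c(result,result1)
}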
df_test$distance=abs(df_test$distance)
df_test$price = 0
df_test$price[df_test$choice=="plane"] = rnorm(20, mean = 300, sd=30)
df_test$price[df_test$choice=="car"] = rnorm(20, mean = 50, sd=10)
df_test$price[df_test$choice=="boat"] = rnorm(20, mean = 100, sd=15)
df_test$price[df_test$choice=="train"] = rnorm(20, mean = 120, sd=25)
df_test$choice2=result1
mlog=mlogit(choice2 ~ distance + price , data = df_test)
#the function logsum generates expected utility for each individual
logsum(mlog)
#so what would be an adequate alternative with survival's clogit? I set an example below
#of what I would like to regress and then perform something like logsum()
#note: strata() has to be a term inside the formula
clog=clogit(choice2 ~ distance + price + as.factor(choice) + strata(id), data = df_test)
**
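For reference, the log-sum (inclusive value) of a choice situation is the log of the sum of the exponentiated systematic utilities of its alternatives, log(sum_j exp(V_ij)), with V_ij = x_ij' beta. Below is a minimal sketch of computing that quantity by hand from a clogit fit. It is only an illustration under stated assumptions: the `toy` data and the names `fit`, `X`, `V` and `logsum_by_id` are hypothetical, the model has no alternative-specific factor so that the coefficient names match the data columns, and it has not been checked against the output of mlogit's logsum(), which may treat intercepts or the choice of covariates differently.

```r
library(survival)

set.seed(1)  # hypothetical toy data, not the dataset from the question
n_id <- 20
alts <- c("train", "car", "plane", "boat")
n_alt <- length(alts)

toy <- data.frame(
  id       = rep(seq_len(n_id), each = n_alt),
  choice   = rep(alts, times = n_id),
  distance = abs(rnorm(n_id * n_alt) * 10),
  price    = runif(n_id * n_alt, 50, 300)
)

# Mark exactly one chosen alternative per individual
toy$choice2 <- 0
chosen_row <- (seq_len(n_id) - 1) * n_alt + sample(n_alt, n_id, replace = TRUE)
toy$choice2[chosen_row] <- 1

# Conditional logit, one stratum per individual
fit <- clogit(choice2 ~ distance + price + strata(id), data = toy)

# Systematic utility V_ij = x_ij' beta, built from the raw covariates
X <- as.matrix(toy[, names(coef(fit))])
V <- drop(X %*% coef(fit))

# Log-sum per individual: log(sum_j exp(V_ij))
logsum_by_id <- tapply(V, toy$id, function(v) log(sum(exp(v))))
head(logsum_by_id)
```

Building V from the raw covariates, rather than from predict(), avoids any centering of the linear predictor, which would shift the log-sum. With factor terms such as as.factor(choice), the design matrix would have to be expanded first (for example with model.matrix()) so that its columns line up with names(coef(fit)).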