
I am stuck on a small problem. I am building a multinomial logit model with Python statsmodels and want to reproduce an example given in a textbook. That part works, but I am struggling to set a different target value as the base (reference) category for the regression. Can somebody help?

import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt

#import data
df = pd.read_excel('C:/.../diabetes.xlsx')

#split the data in dependent and independent variables
y = df['CC']
X = df.drop(['Patient', 'CC'], axis = 1)
Xc = sm.add_constant(X)

#instantiate and fit multinomial logit
mlogit = sm.MNLogit(y, Xc)
fmlogit = mlogit.fit()

print(fmlogit.summary())

The column 'CC' is the target variable and contains the encoding for the diabetes status:

CC = 1 -> Overt diabetes, CC = 2 -> Chemical diabetes, CC = 3 -> Normal

By default, CC = 1 is the base value; however, I would like CC = 3 to be the base value. Here is my regression output.

Does anybody know how to do this?

Thanks a lot in advance, ig

T1B
  • It looks like this is currently not supported. The workaround is to rename the reference label so it comes first in alphanumerical order. – Josef May 21 '17 at 12:34
  • Okay, I was hoping there is a more elegant way. Alright, many thanks for your reply. – T1B May 22 '17 at 19:12
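The workaround from the comment above (relabel the target so the desired reference category sorts first) could look roughly like the sketch below. It assumes, as the comment and the question suggest, that MNLogit takes the first label in sorted order as the base. The small series `cc` is a hypothetical stand-in for `df['CC']`, and the new codes 0/1/2 are an arbitrary choice made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for df['CC']; the real values come from diabetes.xlsx.
cc = pd.Series([1, 3, 2, 3, 1, 2, 3], name="CC")

# Recode so that the desired reference (3 = Normal) receives the smallest label.
# Assumption: the model uses the first label in sorted order as the base category.
recode = {3: 0, 1: 1, 2: 2}  # 0 = Normal, 1 = Overt diabetes, 2 = Chemical diabetes
y_recoded = cc.map(recode)

# Guard against labels that are missing from the mapping (map() would yield NaN).
if y_recoded.isna().any():
    raise ValueError("Unmapped CC values found; extend the recode dictionary.")

# Show how many rows fall into each recoded category.
print(y_recoded.value_counts().sort_index())
```

With the real data, `y_recoded` would take the place of `y` in the `sm.MNLogit(y, Xc)` call from the question. The reported coefficient blocks should then correspond to the codes 1 and 2, each relative to Normal.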

0 Answers