
I'm not sure why the coefficients differ between these two modules. Both models are fit on the same training dataset, so I assumed the coefficients would be the same, but the model summaries differ. Why?

 import statsmodels.api as sm
 from sklearn.linear_model import LogisticRegression
 from sklearn.model_selection import train_test_split

 # X (features) and Y (binary target) are assumed to be defined earlier
 X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.25, random_state=0)

 # statsmodels logit model
 logit_model = sm.Logit(y_train, X_train)
 result = logit_model.fit()

 # scikit-learn logistic regression with default settings
 logreg = LogisticRegression()
 logreg.fit(X_train, y_train)

 print(logreg.coef_)
 print(result.summary2())
          



 
 Current function value: 0.180680
 Iterations 9
 [[ 0.3973997  -1.06979387 -0.68003664  0.14044959]]



                                  Results: Logit
 ==========================================================================================
 Model:              Logit                                       Pseudo R-squared: 0.737   
 Dependent Variable: Score                                       AIC:              591.2353
 Date:               2022-02-02 14:34                            BIC:              612.7812
 No. Observations:   1614                                        Log-Likelihood:   -291.62 
 Df Model:           3                                           LL-Null:          -1109.6 
 Df Residuals:       1610                                        LLR p-value:      0.0000  
 Converged:          1.0000                                      Scale:            1.0000  
 No. Iterations:     9.0000                                                                
 ---------------------------------------------------------------------------------------------
                           Coef.     Std.Err.       z        P>|z|      [0.025     0.975]
 ---------------------------------------------------------------------------------------------
          Var1              0.8863      0.0612     14.4869    0.0000     0.7664     1.0062
          Var2             -0.8040      0.0520    -15.4649    0.0000    -0.9058    -0.7021
          Var3             -1.1229      0.0819    -13.7037    0.0000    -1.2835    -0.9623
          Var4             -0.2160      0.0968    -2.2324     0.0256    -0.4056    -0.0264
 ==========================================================================================
sm11001
  • Does this answer your question? [Different coefficients: scikit-learn vs statsmodels (logistic regression)](https://stackoverflow.com/questions/50428825/different-coefficients-scikit-learn-vs-statsmodels-logistic-regression) – Chris Feb 02 '22 at 14:54
  • This doesn't make any difference to my results? – sm11001 Feb 02 '22 at 15:08

0 Answers