I'm not sure why the coefficients differ between these two libraries. Both models are fit on the same training data, so I assumed the coefficients would match, but the statsmodels summary clearly disagrees with sklearn's output. Why?
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same split used for both models
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.25, random_state=0)

# statsmodels logistic regression
logit_model = sm.Logit(y_train, X_train)
result = logit_model.fit()

# scikit-learn logistic regression
logreg = LogisticRegression()
logreg.fit(X_train, y_train)

print(logreg.coef_)
print(result.summary2())
Current function value: 0.180680
Iterations 9
[[ 0.3973997 -1.06979387 -0.68003664 0.14044959]]
Results: Logit
==========================================================================================
Model: Logit Pseudo R-squared: 0.737
Dependent Variable: Score AIC: 591.2353
Date: 2022-02-02 14:34 BIC: 612.7812
No. Observations: 1614 Log-Likelihood: -291.62
Df Model: 3 LL-Null: -1109.6
Df Residuals: 1610 LLR p-value: 0.0000
Converged: 1.0000 Scale: 1.0000
No. Iterations: 9.0000
---------------------------------------------------------------------------------------------
Coef. Std.Err. z P>|z| [0.025 0.975]
---------------------------------------------------------------------------------------------
Var1 0.8863 0.0612 14.4869 0.0000 0.7664 1.0062
Var2 -0.8040 0.0520 -15.4649 0.0000 -0.9058 -0.7021
Var3 -1.1229 0.0819 -13.7037 0.0000 -1.2835 -0.9623
Var4 -0.2160 0.0968 -2.2324 0.0256 -0.4056 -0.0264
==========================================================================================