I read this link and tried to change the reference category for the dependent variable when using glm from statsmodels.formula.api:
glm(formula="C(y, Treatment(reference=-1)) ~ x1 + x2", data=dta, family=sm.families.Binomial())
The dependent variable can only take two values, y = {-1, 1}. I specified the reference category as above and also tried changing it from -1 to 1, yet the signs of the logistic regression coefficients stay the same. What did I do wrong here?
It's also confusing that the logistic regression output does not tell whether an increase in x1 has a negative impact on the probability of -1 or of 1. Can someone help me out here, please?
                                         Generalized Linear Model Regression Results
==============================================================================================================================
Dep. Variable:    ["C(y, Treatment(reference=-1))[-1.0]", "C(y, Treatment(reference=-1))[1.0]"]   No. Observations:   3311
Model:            GLM                                                                             Df Residuals:       3309
Model Family:     Binomial                                                                        Df Model:           1
Link Function:    logit                                                                           Scale:              1.0000
Method:           IRLS                                                                            Log-Likelihood:     -2292.4
Date:             Wed, 17 Nov 2021                                                                Deviance:           4584.8
Time:             22:51:58                                                                        Pearson chi2:       3.31e+03
No. Iterations:   4
Covariance Type:  nonrobust
==============================================================================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------------------------------------------------------
x1            -0.1769      0.120     -1.473      0.141      -0.412       0.058
x2             0.2388      0.110      2.164      0.030       0.022       0.455
==============================================================================================================================