I have run into a problem fitting a binomial logistic regression: the results differ between languages. After spending an extended period looking into this and searching online for suggestions (and trying every variation of the data, just in case), I believe it comes down to which fitting procedure MATLAB uses for glmfit.
(I have a sneaking suspicion it's a maximum likelihood estimator, whereas Python and R use IRLS/IWLS.)
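For context on that suspicion, here is a short sketch of why the two labels may not point to different estimators. In the usual textbook derivation, IRLS is Newton's method (Fisher scoring) applied to the binomial log-likelihood, so at convergence it returns the maximum likelihood estimate. The notation below (X for the design matrix, y for the 0/1 response, beta for the coefficients) is assumed for illustration and is not taken from any of the three libraries.

```latex
\begin{align*}
\eta &= X\beta, \qquad \mu_i = \frac{1}{1 + e^{-\eta_i}}, \qquad
W = \operatorname{diag}\bigl(\mu_i (1 - \mu_i)\bigr) \\
\ell(\beta) &= \sum_i \Bigl[ y_i \eta_i - \log\bigl(1 + e^{\eta_i}\bigr) \Bigr] \\
\nabla \ell(\beta) &= X^{\top} (y - \mu), \qquad
-\nabla^2 \ell(\beta) = X^{\top} W X \\
\beta^{(t+1)} &= \beta^{(t)} + \bigl(X^{\top} W X\bigr)^{-1} X^{\top} (y - \mu)
 = \bigl(X^{\top} W X\bigr)^{-1} X^{\top} W z,
\qquad z = \eta + W^{-1} (y - \mu)
\end{align*}
```

The last line is a weighted least-squares fit of the working response z on X, repeated until the coefficients stop changing, which is where the "iteratively reweighted" name comes from.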
I first ran my problem in MATLAB using:
[b_lr,dev,stats] = glmfit(x',y','binomial','link','logit');
where x' is a multi-column array of predictors with the same number of rows as y, and y is a response vector with a binary result based on the criterion.
Since that calculation I've moved to using Python with rpy2. I tried the same procedure in both Python (using the statsmodels equivalent of glmfit) and R, fitting a logit-linked binomial, and got a different set of coefficients for the regression (note that the position of the response vector changes for these two):
glm_logit = sm.GLM(yvec.T,Xmat,family = sm.families.Binomial()).fit()
and using rpy2:
%R glm.out = glm(Data ~ ONI + Percentiles, family=binomial(logit), data=df)
I would appreciate it if someone could clarify what MATLAB uses, and any suggestions for how to replicate the MATLAB result in Python or R.
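One difference worth ruling out before blaming the fitting algorithm is the intercept. glmfit adds a constant term by default, and an R formula includes an intercept unless it is removed, whereas sm.GLM fits exactly the columns it is given unless a constant is added (for example with sm.add_constant). The following is a minimal sketch on made-up data, using a hand-written IRLS loop in plain Python rather than the code any of these libraries actually run. The helper names (solve, irls_logit) and the simulated data are hypothetical, not my real data set.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def irls_logit(X, y, n_iter=25):
    """Fit a logit-link binomial model by Newton / IRLS steps.

    X is a list of rows; no intercept column is added automatically.
    """
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        eta = [sum(b * v for b, v in zip(beta, row)) for row in X]
        mu = [1.0 / (1.0 + math.exp(-e)) for e in eta]
        w = [m * (1.0 - m) for m in mu]
        # Fisher information X'WX and score X'(y - mu)
        info = [[sum(wi * r[j] * r[k] for wi, r in zip(w, X))
                 for k in range(p)] for j in range(p)]
        score = [sum(r[j] * (yi - mi) for r, yi, mi in zip(X, y, mu))
                 for j in range(p)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Simulated data (assumption): true intercept 1.0, true slope 2.0.
random.seed(0)
x = [random.uniform(-2.0, 2.0) for _ in range(500)]
y = [1.0 if random.random() < 1.0 / (1.0 + math.exp(-(1.0 + 2.0 * xi))) else 0.0
     for xi in x]

# Same data, with and without an explicit column of ones.
with_const = irls_logit([[1.0, xi] for xi in x], y)
no_const = irls_logit([[xi] for xi in x], y)

print("with constant column:   ", with_const)
print("without constant column:", no_const)
```

If the two printed slope estimates differ here, the same kind of mismatch could explain different coefficients between glmfit and sm.GLM on identical data, independent of the optimizer. Passing sm.add_constant(Xmat) on the statsmodels side would be one way to test that.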