I'm fitting a logistic regression in Python using this example from Wikipedia (link to example).
Here's the code I have:
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
Z = [[0.5], [0.75], [1.0], [1.25], [1.5], [1.75], [1.75], [2.0], [2.25], [2.5], [2.75], [3.0], [3.25], [3.5], [4.0], [4.25], [4.5], [4.75], [5.0], [5.5]] # number of hours spent studying
y = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1] # 0=failed, 1=pass
lr.fit(Z,y)
The results are:
lr.coef_
array([[ 0.61126347]])
lr.intercept_
array([-1.36550178])
Wikipedia, however, reports a coefficient of 1.5046 for hours and an intercept of -4.0777. Why are the results so different? Their predicted probability of passing after 1 hour of study is 0.07, while my model gives 0.32, which is a drastically different result.
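For reference, here is a minimal sketch of how I get the two probabilities from the coefficients, assuming the usual logistic form p = 1 / (1 + exp(-(intercept + coef * hours))). The helper name `pass_probability` is my own and is not part of scikit-learn or the Wikipedia article; the numbers are the ones quoted in this post.

```python
import math


def pass_probability(hours, coef, intercept):
    """Logistic function applied to a one-feature linear model (hypothetical helper)."""
    z = intercept + coef * hours
    return 1.0 / (1.0 + math.exp(-z))


# Coefficients reported in the Wikipedia example
p_wiki = pass_probability(1.0, coef=1.5046, intercept=-4.0777)

# Coefficients returned by my LogisticRegression() fit
p_sklearn = pass_probability(1.0, coef=0.61126347, intercept=-1.36550178)

print(f"Wikipedia model: {p_wiki:.2f}")    # 0.07
print(f"My sklearn fit:  {p_sklearn:.2f}")  # 0.32
```

So the gap in the predictions follows directly from the gap in the fitted coefficients; it is not a mistake in how I compute the probability.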