
The regularization parameter C in logistic regression (see http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) controls the strength of the penalty (smaller C means stronger regularization); the penalty keeps the fitting problem well defined and avoids both overfitting and degenerate step-function solutions on separable data (see https://datascience.stackexchange.com/questions/10805/does-scikit-learn-use-regularization-by-default/10806).

However, regularization in logistic regression should only apply to the feature weights, not the intercept (also explained here: http://aimotion.blogspot.com/2011/11/machine-learning-with-python-logistic.html).
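
Concretely, the binary L2-penalized objective documented in the scikit-learn user guide penalizes only the weight vector w, leaving the intercept b unpenalized (up to notation, with labels y_i in {-1, 1}):

\min_{w, b} \; \frac{1}{2} w^\top w + C \sum_{i=1}^{n} \log\left(1 + \exp\left(-y_i (x_i^\top w + b)\right)\right)

Since b does not appear in the penalty term, it should not be shrunk towards 0.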

But it seems that sklearn.linear_model.LogisticRegression actually regularizes the intercept as well. Here is why:

1) Look carefully at the plots in the link above (https://datascience.stackexchange.com/questions/10805/does-scikit-learn-use-regularization-by-default/10806): the fitted sigmoid is shifted slightly to the left, towards 0, which is exactly what shrinking the intercept would cause.

2) I fitted data points with a logistic curve using a hand-written penalized maximum likelihood function. Including the intercept in the L2 penalty gives results identical to sklearn's (a sketch of this check follows below).
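
Roughly, the check looked like this (a simplified sketch using scipy.optimize.minimize; solver tolerances may change the last digits):

import numpy as np
from scipy.optimize import minimize
from sklearn.linear_model import LogisticRegression

C = 1e1
x = np.arange(100, 110).astype(float)
y = np.array([0]*5 + [1]*5)

def objective(params):
    a, b = params
    z = a * x + b
    # negative log-likelihood of the logistic model, numerically stable form
    nll = np.sum(np.logaddexp(0, z) - y * z)
    # L2 penalty INCLUDING the intercept b -- the point in question
    return nll + (a**2 + b**2) / (2 * C)

res = minimize(objective, x0=[0.0, 0.0], method="BFGS")
print("manual fit (a, b):", res.x)

model = LogisticRegression(C=C).fit(x[:, np.newaxis], y)
print("sklearn fit (a, b):", model.coef_[0][0], model.intercept_[0])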

Two questions please:

1) Did I get this wrong, is this a bug, or is there a well-justified reason for regularizing the intercept?

2) Is there a way in sklearn to regularize all parameters except the intercept?

Thanks!

import numpy as np
from sklearn.linear_model import LogisticRegression

C = 1e1
model = LogisticRegression(C=C)

x = np.arange(100, 110)
x = x[:, np.newaxis]
y = np.array([0]*5 + [1]*5)

print(x)
print(y)

model.fit(x, y)
a = model.coef_[0][0]
b = model.intercept_[0]

b_modified = -b/a                   # without regularization, b_modified should be 104.5 (as for C=1e10)

print("a, b:", a, b_modified)

# OUTPUT: 
# [[100]
#  [101]
#  [102]
#  [103]
#  [104]
#  [105]
#  [106]
#  [107]
#  [108]
#  [109]]
# [0 0 0 0 0 1 1 1 1 1]
# a, b: 0.0116744221756 100.478968664
wolf
  • I asked a similar question several years ago. I know you can use the argument intercept_scaling to control this but I’m not sure of the appropriate technique. https://stackoverflow.com/questions/17711304/how-to-set-intercept-scaling-in-scikit-learn-logisticregression – cxrodgers Nov 02 '17 at 05:13

2 Answers


scikit-learn's logistic regression is regularized by default.

Changing the intercept_scaling parameter of sklearn.linear_model.LogisticRegression affects the result in much the same way as changing the C parameter does.

This is because regularization also affects the estimate of the bias (intercept), and the higher the intercept_scaling value, the smaller that effect. Per the official documentation:

The intercept becomes intercept_scaling * synthetic_feature_weight.

Note! the synthetic feature weight is subject to l1/l2 regularization as all other features. To lessen the effect of regularization on synthetic feature weight (and therefore on the intercept) intercept_scaling has to be increased.
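
To see this numerically, one can refit the question's example with increasing intercept_scaling (a quick sketch, reusing the x and y defined there):

from sklearn.linear_model import LogisticRegression

for scaling in (1e0, 1e1, 1e2, 1e3):
    m = LogisticRegression(C=1e1, intercept_scaling=scaling).fit(x, y)
    a, b = m.coef_[0][0], m.intercept_[0]
    # as intercept_scaling grows, the boundary -b/a should move from ~100.5
    # towards the unregularized 104.5
    print(scaling, -b / a)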


Hope it helps!

Prem
  • I added a minimal example. Can you please illustrate with this example what needs to be added to not regularize the intercept? Thanks! – wolf Nov 03 '17 at 01:43
  • I think that if you make the intercept scaling arbitrarily large, the regularization of it will be negligible (but not strictly zero) – cxrodgers Nov 03 '17 at 16:52
  • As @cxrodgers has rightly pointed out - let's try `model = LogisticRegression(intercept_scaling=99999)`. – Prem Nov 03 '17 at 19:14

Thanks @Prem, this is indeed the solution:

C = 1e1
intercept_scaling = 1e3    # very high values make the fit numerically unstable in practice
model = LogisticRegression(C=C, intercept_scaling=intercept_scaling)
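
With the x and y from the question, the decision boundary then moves back close to the unregularized value:

model.fit(x, y)
a = model.coef_[0][0]
b = model.intercept_[0]
print("a, -b/a:", a, -b/a)   # -b/a should now come out close to 104.5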
wolf