
In most classifiers (e.g., logistic or linear regression), the bias term is ignored during regularization. Will we get better classification if we don't regularize the bias term?

Franck Dernoncourt

1 Answer


Example:

Y = aX + b

Regularization is based on the idea that overfitting on Y is caused by a being "overly specific", so to speak, which usually manifests itself as large values in a's elements.

b merely offsets the relationship, so its scale is far less important to this problem. Moreover, if a large offset is needed for whatever reason, regularizing it will prevent finding the correct relationship.

So the answer lies in this: in Y = aX + b, a is multiplied by the explanatory/independent variable X, while b is merely added to the result.
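
To make this concrete, here is a minimal sketch (not from the original answer; the data, the λ value, and the `ridge` helper are made up for illustration). It fits closed-form ridge regression on data with a large true offset, once with the bias penalized and once without:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with a small slope and a large offset: Y = aX + b with a = 2, b = 100.
X = rng.uniform(-1, 1, size=(200, 1))
y = 2.0 * X[:, 0] + 100.0 + rng.normal(scale=0.1, size=200)

# Design matrix with a column of ones for the bias term b.
A = np.hstack([X, np.ones((len(X), 1))])
lam = 10.0  # regularization strength (arbitrary choice for illustration)

def ridge(penalize_bias):
    # Closed-form ridge solution: w = (A^T A + lam * D)^(-1) A^T y,
    # where the diagonal matrix D selects which coefficients are penalized.
    D = np.eye(A.shape[1])
    if not penalize_bias:
        D[-1, -1] = 0.0  # leave the bias column unpenalized
    return np.linalg.solve(A.T @ A + lam * D, A.T @ y)

print("bias penalized:   a, b =", ridge(True))   # b is shrunk noticeably below 100
print("bias unpenalized: a, b =", ridge(False))  # recovers b close to the true 100
```

The slope is penalized equally in both fits; the only difference is that the penalized bias is pulled toward zero, so the fit systematically underestimates the large true offset, matching the point above.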

Def_Os
  • Why do you call `X` the "explanatory variable"? Is there a reference? Thanks. – CyberPlayerOne Jun 25 '18 at 05:43
  • @Tyler提督九门步军巡捕五营统领, more commonly `X` would be referred to as the ["dependent variable"](https://en.wikipedia.org/wiki/Dependent_and_independent_variables). – Def_Os Jun 28 '18 at 21:00
  • 1
    @Def_Os, no, in this terminology `X` would be the _independent_ variable, and `Y` is the dependent one (`Y` depends on `X`). In response to @Tyler's question, the linked article mentions "explanatory variable" as a synonym for independent variable. – wjakobw Oct 18 '18 at 13:58