
I'm calculating the weights for a linear regression with weight decay, i.e. I am trying to find beta = (X'X + lambda I)^-1 X'Y, where X has n rows of D features each and Y is the vector of outputs, one per row of X.

I've been fitting without a bias term by using:

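import numpy as np

def wd_fit(A, y, lamb=0):
    n_col = A.shape[1]
    # Solve (A'A + lamb*I) beta = A'y; lstsq returns a tuple whose first element is beta
    return np.linalg.lstsq(A.T.dot(A) + lamb * np.identity(n_col), A.T.dot(y))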

I'd like to also calculate a bias (intercept) term for the fit, instead of forcing it through the origin. I'd like to keep the same call to lstsq, so a matrix transform I can apply beforehand would be ideal. My inclination is to append a column of 1s, so that X_mod, say, would have D+1 features with the last one corresponding to the intercept, but I'm not sure where the column should go or whether this is even correct.

Jaxter

1 Answer


If you don't want to mean-center your variables, adding a column of ones will work and is a perfectly acceptable solution.

The bias term will just be the coefficient at the position of the added column.
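As a minimal sketch of the column-of-ones approach, assuming NumPy and the same normal-equations call to lstsq as in the question: the function name wd_fit_with_bias and the penalize_bias flag are hypothetical, introduced only for illustration. One point worth knowing is that with lamb > 0 a plain lamb * I penalty also shrinks the intercept; zeroing the diagonal entry for the ones column leaves the intercept unpenalised, which should give the same weights as mean-centering X and y first.

```python
import numpy as np

def wd_fit_with_bias(A, y, lamb=0.0, penalize_bias=False):
    """Ridge fit with an intercept via an appended column of ones.

    Hypothetical helper: returns (weights, bias).
    """
    n_rows = A.shape[0]
    # Append the ones column last, so the bias is the last coefficient.
    A1 = np.hstack([A, np.ones((n_rows, 1))])
    penalty = lamb * np.identity(A1.shape[1])
    if not penalize_bias:
        # Leave the intercept out of the weight-decay penalty.
        penalty[-1, -1] = 0.0
    # lstsq returns (solution, residuals, rank, singular values).
    beta = np.linalg.lstsq(A1.T.dot(A1) + penalty, A1.T.dot(y), rcond=None)[0]
    return beta[:-1], beta[-1]

# Usage on synthetic data (values chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X.dot(np.array([1.0, -2.0, 0.5])) + 2.0
weights, bias = wd_fit_with_bias(X, y, lamb=0.0)
```

The position of the ones column does not matter for the fit itself; it only determines which entry of the solution is the bias.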

cangrejo