
When I do ridge regression using sklearn in Python, the coef_ output gives me a 2D array. According to the documentation it is (n_targets, n_features).

I understand that features are my coefficients. However, I am not sure what targets are. What is this?

django_noob

1 Answer


The targets are the values you want to predict. Ridge regression can in fact predict several values for each instance, not only one. In that case coef_ contains one row of coefficients per target, and the result is the same as if you had trained a separate model for each target.
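To make the "same as separate models" point concrete, here is a minimal sketch. It assumes a single scalar alpha shared by all targets and uses made-up data; the names joint and separate are only illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Illustrative data: 50 instances, 2 features, 3 targets.
rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 2))
Y = X @ np.array([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])

# One model fitted on all three targets at once.
joint = Ridge(alpha=1.0).fit(X, Y)

# Three models, each fitted on a single target column, stacked row by row.
separate = np.vstack(
    [Ridge(alpha=1.0).fit(X, Y[:, k]).coef_ for k in range(Y.shape[1])]
)

print(joint.coef_.shape)                    # one row per target
print(np.allclose(joint.coef_, separate))   # compare joint vs. separate fits
```

Row k of joint.coef_ should agree with the coefficients of the model trained on target k alone, up to floating-point error.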

Let's have a look at a simple example. I will use LinearRegression instead of Ridge, because Ridge shrinks the coefficients, which makes the result harder to interpret.

First, we create some random data:

import numpy as np

X = np.random.uniform(size=100).reshape(50, 2)  # 50 instances, 2 features
y = np.dot(X, [[1, 2, 3], [3, 4, 5]])           # 50 instances, 3 targets

The first three instances in X are:

[[ 0.70335619  0.42612165]
 [ 0.2959883   0.10571314]
 [ 0.33868804  0.07351525]]

The targets y for these instances are

[[ 1.98172114  3.11119897  4.24067681]
 [ 0.61312771  1.01482915  1.41653058]
 [ 0.55923378  0.97143708  1.38364037]]

Notice that for each instance x with target vector y, y[0] = x[0] + 3*x[1], y[1] = 2*x[0] + 4*x[1] and y[2] = 3*x[0] + 5*x[1] (that's how we created the data with the matrix multiplication).
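The per-target equations can be checked directly against the matrix product. This is only a sketch with freshly generated random data, so the numbers differ from the ones printed above; W, t0, t1 and t2 are illustrative names.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 2))

# Rows of W correspond to features, columns to targets.
W = np.array([[1, 2, 3], [3, 4, 5]])
y = X @ W

# The same three targets written out as explicit equations.
t0 = 1 * X[:, 0] + 3 * X[:, 1]
t1 = 2 * X[:, 0] + 4 * X[:, 1]
t2 = 3 * X[:, 0] + 5 * X[:, 1]

print(y.shape)
print(np.allclose(y, np.column_stack([t0, t1, t2])))
```

Each column of W is the coefficient vector for one target, so the fitted coef_ is expected to be the transpose of W.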

If we now fit the linear regression model

from sklearn import linear_model

clf = linear_model.LinearRegression()
clf.fit(X, y)

the values in coef_ are:

[[ 1.  3.]
 [ 2.  4.]
 [ 3.  5.]]

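This exactly matches the equations we used to create the data: row k of coef_ holds the coefficients for target k, which is why its shape is (n_targets, n_features).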
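As for why coef_ can come out 2D in the first place: a likely cause, assuming y was passed as a 2D array (for example a single column), is that scikit-learn then treats it as a multi-target problem with one target. The sketch below compares the shapes and also rebuilds the predictions by hand; the variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 2))
Y = X @ np.array([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])

# Same single target, passed once as a 1D array and once as a 2D column.
shape_1d = Ridge().fit(X, Y[:, 0]).coef_.shape     # y has shape (50,)
shape_col = Ridge().fit(X, Y[:, [0]]).coef_.shape  # y has shape (50, 1)

# All three targets at once.
model = Ridge().fit(X, Y)
shape_multi = model.coef_.shape

# Predictions are a linear map: one coefficient row and one intercept per target.
manual = X @ model.coef_.T + model.intercept_

print(shape_1d, shape_col, shape_multi)
print(np.allclose(manual, model.predict(X)))
```

If a 1D coef_ is wanted for a single target, flattening y before fitting (for instance with y.ravel()) should be enough.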

  • So we can call this `coef_` as weights of the model? – Ahmad Anis Jun 07 '20 at 15:29
  • Yes, you could call them "weights", although this term is mainly used in machine learning, particularly in relation to neural networks. In linear regression, they are commonly called "coefficients", "effects" or just generally "parameters". – Martin Pilát Jul 16 '20 at 00:07