I am trying to use the .score() method on a fitted LinearRegression model, but I am getting an error.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
from sklearn.metrics import mean_squared_error

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=104)
reg = LinearRegression()
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)
print("R^2: {}".format(reg.score(X_test, y_test)))
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print("Root Mean Squared Error: {}".format(rmse))
# This last call is the one that raises the ValueError
reg.score(y_test.reshape(-1,1), y_pred.reshape(-1,1))

ValueError: shapes (10719,1) and (16,1) not aligned: 1 (dim 1) != 16 (dim 0)

I should mention that I have already tried reshaping y_pred and y_test so that they match, but it still does not work. I am not sure why the error mentions (16,1); what do those dimensions refer to? I have searched for similar questions, such as "Error using sklearn and linear regression: shapes (1,16) and (1,1) not aligned: 16 (dim 1) != 1 (dim 0)", but I am still confused.

Edit: Here is the output for the shapes:

print(X_test.shape, y_test.shape, y_pred.shape)

(10719, 16) (10719, 1) (10719, 1)
  • It looks like y_test is somehow 16 rows of data. Could you `print(X_test.shape, y_test.shape, y_pred.shape)`? – Sam Shleifer Aug 22 '18 at 03:15
  • @SamShleifer Yes, I ended up getting (10719,16) for X_test and (10719,1) for both y_test and y_pred – KoreanInvestor Aug 22 '18 at 03:29
  • Also, including some minimal test data would help. – Sam Shleifer Aug 22 '18 at 03:32
  • Try changing the last line to `reg.score(y_test.reshape(-1,), y_pred.reshape(-1,))`? – Sam Shleifer Aug 22 '18 at 03:33
  • Please post the complete stack trace of the error along with the version of scikit-learn you are using. With your given code and shapes, I am not able to reproduce it on my data with `scikit 0.19.2`. – Vivek Kumar Aug 22 '18 at 04:40
  • By the way, `reg.score(y_test.reshape(-1,1), y_pred.reshape(-1,1))` is wrong. You cannot use `y_test` and `y_pred` in `reg.score()`. It requires a feature matrix and the true target values, as you are already doing correctly here: `reg.score(X_test, y_test)`. – Vivek Kumar Aug 22 '18 at 06:03

1 Answer

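From the scikit-learn docs, the signature is score(X, y, sample_weight=None), so the first argument is the feature matrix, not the predictions: score() calls predict(X) itself and compares the result with y. That is also where the (16,1) in the error comes from. The model was fitted on 16 features, so it holds 16 coefficients and cannot be applied to a (10719,1) input.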

Therefore, the last line should be print(reg.score(X_test, y_test))
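
As a rough illustration, here is a minimal sketch on synthetic data. The random X and y below are assumptions standing in for the asker's dataset, which is not shown. It compares reg.score(X_test, y_test) with sklearn.metrics.r2_score(y_test, y_pred), which is the function to reach for when you already have predictions, and it checks that passing targets in place of features raises a ValueError.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the asker's data (an assumption):
# 16 features and a single target column.
rng = np.random.default_rng(104)
X = rng.normal(size=(200, 16))
y = X @ rng.normal(size=(16, 1)) + 0.1 * rng.normal(size=(200, 1))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=104
)
reg = LinearRegression().fit(X_train, y_train)
y_pred = reg.predict(X_test)

# score() takes the features; it calls predict(X_test) internally.
r2_from_score = reg.score(X_test, y_test)

# r2_score() is the function that compares targets with predictions.
r2_from_metric = r2_score(y_test, y_pred)

print("R^2 via score():   ", r2_from_score)
print("R^2 via r2_score():", r2_from_metric)

# Passing targets where features are expected fails: the model was
# fitted on 16 columns and receives only 1.
try:
    reg.score(y_test.reshape(-1, 1), y_pred.reshape(-1, 1))
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

The exact wording of the ValueError depends on the scikit-learn version, but the cause is the same: the first argument does not have the number of columns the model was trained on.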

Sam Shleifer