
I ran the following lines of code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

from sklearn.datasets import load_boston
boston = load_boston()
print(boston.data.shape) 

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

x = pd.DataFrame(boston.data)
x.columns = boston.feature_names
y=pd.DataFrame(boston.target)
y.columns=['TARGET']

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3,  random_state=101)

model = LinearRegression()

model.fit(x_train,y_train)


print('Coefficients: \n', model.coef_)
len(model.coef_)

Coefficients: 
 [[-8.74917163e-02  5.02793747e-02  2.06785359e-02  3.75457604e+00
  -1.77933846e+01  3.24118660e+00  1.20902568e-02 -1.40965453e+00
   2.63476633e-01 -1.03376395e-02 -9.52633123e-01  6.20783942e-03
  -5.97955998e-01]]
1


coeffecients = pd.DataFrame(data=model.coef_,index=x.columns,columns=['Coefficient'])

error msg: Shape of passed values is (13, 1), indices imply (1, 13)

I think the issue is that len(model.coef_) is 1 rather than 13, but I'm not sure.

Ian M.
  • Which version of scikit-learn are you using? I am not able to duplicate the issue on `0.19.2`. There, coef_ is already in the form that the answer below converts it to. – Vivek Kumar Aug 10 '18 at 11:29

1 Answer


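IMO, this happens because your y_train is a 2D DataFrame with shape (n_samples, 1). The fit is then treated as a multi-target problem, so coef_ comes back with shape (1, 13) instead of (13,), which does not match an index of 13 feature names. From the scikit-learn docs: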

coef_ : array, shape (n_features, ) or (n_targets, n_features)

Estimated coefficients for the linear regression problem. If multiple targets are passed during the fit (y 2D), this is a 2D array of shape (n_targets, n_features), while if only one target is passed, this is a 1D array of length n_features.
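
To make the shape difference concrete, here is a minimal sketch. It uses random synthetic data as a stand-in for the Boston dataset (the sizes, seed, and variable names are assumptions for illustration, not taken from the question) and fits the same target once as a 1D array and once as a single-column 2D array:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the Boston data: 100 samples, 13 features (assumed sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 13))
y_1d = X @ rng.normal(size=13) + rng.normal(scale=0.1, size=100)
y_2d = y_1d.reshape(-1, 1)  # same target, stored as a single-column 2D array

coef_1d = LinearRegression().fit(X, y_1d).coef_
coef_2d = LinearRegression().fit(X, y_2d).coef_

print(coef_1d.shape)  # (13,)
print(coef_2d.shape)  # (1, 13)
```

The fitted values are the same in both cases; only the array layout differs, which is why len(model.coef_) is 1 in the question.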

Passing np.ravel(y_train) to fit instead, or simply defining y = pd.Series(boston.target), should fix this.
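
A rough sketch of the fix, again on synthetic data rather than the Boston dataset (the feature names f0..f12 and the random values are hypothetical placeholders). It also shows one extra option that is not part of the suggestion above: keeping the 2D fit and flattening coef_ afterwards.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-ins for x_train / y_train from the question.
rng = np.random.default_rng(0)
feature_names = [f"f{i}" for i in range(13)]
x_train = pd.DataFrame(rng.normal(size=(100, 13)), columns=feature_names)
y_train = pd.DataFrame({"TARGET": rng.normal(size=100)})  # 2D, shape (100, 1)

# Option 1: flatten the target before fitting, so coef_ has shape (13,).
model = LinearRegression()
model.fit(x_train, np.ravel(y_train))
coefficients = pd.DataFrame(
    data=model.coef_, index=x_train.columns, columns=["Coefficient"]
)

# Option 2 (an alternative, not from the answer): fit on the 2D target
# and flatten the (1, 13) coefficient array afterwards.
model_2d = LinearRegression().fit(x_train, y_train)
coefficients_2d = pd.DataFrame(
    data=model_2d.coef_.ravel(), index=x_train.columns, columns=["Coefficient"]
)

print(coefficients.shape)
```

Either way, the data passed to pd.DataFrame ends up with one value per feature name, so the index and the values line up.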

Sacry