I ran the following block of code and got a ValueError.
Code (X_train and y_train are each a pandas Series, i.e. a single column of data):
from sklearn.linear_model import LinearRegression
regressor = LinearRegression(fit_intercept=True)
regressor.fit(X_train, y_train)
Error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-167-3392c2ad36e2> in <module>
2 from sklearn.linear_model import LinearRegression
3 regressor = LinearRegression(fit_intercept=True)#Instantiating an object of the LinearRegression class.#"fit_intercept = True" is asking the linear regressor to assume that there is a y-intercept.
----> 4 regressor.fit(X_train, y_train) #Passing in our training data
~\Anaconda3\lib\site-packages\sklearn\linear_model\base.py in fit(self, X, y, sample_weight)
461 n_jobs_ = self.n_jobs
462 X, y = check_X_y(X, y, accept_sparse=['csr', 'csc', 'coo'],
--> 463 y_numeric=True, multi_output=True)
464
465 if sample_weight is not None and np.atleast_1d(sample_weight).ndim > 1:
The code works after I convert X_train and y_train to DataFrames, using X = pd.DataFrame(IceCream.Temperature) and y = pd.DataFrame(IceCream.Revenue). However, I do not understand why the DataFrames work but the Series do not.
I am taking a Machine Learning course from SuperDataScience.com, and the block of code at the top of this question worked for the instructor without converting the Series to DataFrames. Any help will be greatly appreciated.
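For reference, here is a minimal sketch that reproduces the difference. It assumes a hypothetical IceCream DataFrame with made-up values (my real data is different), and the names X_series and X_frame are only for illustration. The only difference I can see between the two inputs is the shape: the Series is one-dimensional, (n,), while the DataFrame is two-dimensional, (n, 1).

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for the IceCream data; the values are made up.
IceCream = pd.DataFrame({
    "Temperature": [10.0, 15.0, 20.0, 25.0, 30.0],
    "Revenue": [200.0, 300.0, 400.0, 500.0, 600.0],
})

X_series = IceCream.Temperature               # 1-D: shape (5,)
X_frame = pd.DataFrame(IceCream.Temperature)  # 2-D: shape (5, 1)
y = IceCream.Revenue

print(X_series.shape, X_frame.shape)

regressor = LinearRegression(fit_intercept=True)

# Fitting on the 1-D Series raises a ValueError.
try:
    regressor.fit(X_series, y)
    series_fit_failed = False
except ValueError as err:
    series_fit_failed = True
    print("Series failed:", err)

# Fitting on the single-column DataFrame works.
regressor.fit(X_frame, y)
print(regressor.coef_, regressor.intercept_)
```

With this toy data the Series call fails and the DataFrame call succeeds, which matches what I see with my real data.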