I created this program in March and it worked fine then, but now it raises an error and I can't figure out why.
Here is the current non-working code (written in a Jupyter notebook):
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import datetime
import seaborn
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
pd.options.mode.chained_assignment = None # default='warn'
df = yf.download("spy")
df.to_csv('spy.csv')
df = df[['Adj Close']]
plt.plot(df)
df['Adj Close'].plot(figsize=(15,6), color = 'g')
plt.legend(loc='upper left')
plt.show()
forecast = 70
df['Prediction'] = df[['Adj Close']].shift(-forecast)
X = np.array(df.drop(['Prediction'], axis=1))
X = preprocessing.scale(X)
X_forecast = X[-forecast:]
X = X[:-forecast]
y = np.array(df['Prediction'])
y = y[:-forecast]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
clf = LinearRegression()
clf.fit(X_train, y_train)
confidence = clf.score(X_test, y_test)
confidence
forecast_predicted = clf.predict(X_forecast)
print(forecast_predicted)
plt.plot(X, y)
dates = pd.date_range(start="2021-05-21", end= "2021-06-19")
plt.plot(dates, forecast_predicted, color='b')
df['Adj Close'].plot(color='g')
plt.xlim(xmin = datetime.date(2020,5,1))
plt.xlim(xmax = datetime.date(2021,7,1))
I know the error is in the last part of the code. Here is how that part looked when it was working on March 15:
dates = pd.date_range(start="2021-03-16", end= "2021-04-14")
plt.plot(dates, forecast_predicted, color='b')
df['Adj Close'].plot(color='g')
plt.xlim(xmin = datetime.date(2020,3,1))
plt.xlim(xmax = datetime.date(2021,5,1))
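
One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis since the error message isn't shown here: both hard-coded date ranges above span 30 calendar days, while `forecast = 70` makes `forecast_predicted` 70 values long. If the error is a shape mismatch raised by `plt.plot(dates, forecast_predicted, ...)`, that difference would explain it. The snippet below uses a placeholder array in place of the real predictions (an assumption, so it runs without downloading data) and compares the two lengths, then builds the date range from the forecast length instead of a fixed end date.

```python
import numpy as np
import pandas as pd

forecast = 70  # same value as in the code above

# Placeholder standing in for clf.predict(X_forecast): one value per forecast row.
forecast_predicted = np.zeros(forecast)

# The hard-coded range from the plotting code covers 30 calendar days.
fixed_dates = pd.date_range(start="2021-05-21", end="2021-06-19")
print(len(fixed_dates), len(forecast_predicted))  # 30 70

# Deriving the range from the number of predictions keeps x and y the same size.
dates = pd.date_range(start="2021-05-21", periods=len(forecast_predicted))
print(len(dates) == len(forecast_predicted))  # True
```

With `periods=len(forecast_predicted)` the x values follow whatever `forecast` is set to, so changing `forecast` later would not require editing the end date by hand. Note that this produces calendar days, not trading days.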