
I'm currently trying to predict stock prices, pulled from Quandl, for the next month's worth of business days. I got the idea from a tutorial on pythonprogramming.net (which heavily influences the structure of the code here). However, when I print the predictions, they always start with a massive drop in stock price, so I suspect the model is making predictions on the wrong column in the data frame.

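import numpy as np
import quandl
from sklearn import preprocessing, cross_validation
from sklearn.linear_model import SGDRegressor

df = quandl.get("GOOG/FRA_BHP1", authtoken=MyAuthToken)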
df = df.drop(['Open', 'High', 'Low', 'Volume'], axis=1)
print(df.tail())

forecast_col = 'Close'
df.fillna(value=-99999, inplace=True)
forecast_out = int(22)
df['label'] = df[forecast_col].shift(-forecast_out)

X = np.array(df.drop(['label'], axis=1))
X = preprocessing.scale(X)
X_lately = X[-forecast_out:]
X = X[:-forecast_out]

df.dropna(inplace=True)

y = np.array(df['label'])    

X_train, X_test, y_train, y_test = cross_validation.train_test_split(X, y, test_size=0.2)

clf = SGDRegressor()
clf.fit(X_train, y_train)
confidence = clf.score(X_test, y_test)
forecast_set = clf.predict(X_lately)
confidence = np.round(confidence, 2)

print(forecast_set, confidence, forecast_out)
print(df)

I initially thought this was happening because the index consists of dates and times, as the shift seems to have removed the end of the data instead of the beginning. Now I am unsure what the problem is or how to fix it.

Thank you so much for any replies or suggestions. I appreciate any help you are willing to give me :)

Edit: I should be more specific about what I believe the problem is. The last 22 items in the 'Close' column are being removed from the data frame instead of the first 22, and because of this the predictions are being made from month-old data. Hence I feel that it should be predicting on the 'label' column instead of the 'Close' column.
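
To make the behaviour I'm describing concrete, here is a minimal sketch on made-up data. The prices, the dates, the name `toy` and `forecast_out = 2` are all assumptions for illustration, not my real data. It only shows which rows end up with a NaN label after the shift and which rows `dropna` then removes; it does not fit any model.

```python
import pandas as pd

# Made-up closing prices on six business days (illustrative values only).
dates = pd.date_range("2017-01-02", periods=6, freq="B")
toy = pd.DataFrame({"Close": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]}, index=dates)

forecast_out = 2  # stand-in for the 22 used in the real code

# A negative shift moves values up: each row's label is the Close
# forecast_out rows later, so the last forecast_out labels are NaN.
toy["label"] = toy["Close"].shift(-forecast_out)

# Features are taken before dropna, so the most recent rows are still present.
X = toy[["Close"]].to_numpy()
X_lately = X[-forecast_out:]  # newest rows, which have no label yet
X_train = X[:-forecast_out]   # rows that do have a label

# dropna removes rows whose label is NaN, i.e. the last forecast_out rows.
trimmed = toy.dropna()
y = trimmed["label"].to_numpy()

print(toy)      # full frame, NaN labels at the end
print(trimmed)  # same frame with the last forecast_out rows gone
```

In this toy case the two most recent rows lose their label and disappear from the printed frame, while `X_lately` still holds their 'Close' values, which mirrors what I see at the end of my real data frame.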

  • I'm not exactly sure what you're trying to do here; from what you've written, it seems you're trying to predict the price of GOOG in 22 days, given the current price of the stock. Is this correct? – apnorton Jan 27 '17 at 05:45
  • Sort of, I'm predicting BHP stock prices for the next 22 business days using Google Finance data from Quandl :) – Connor McCluskey Jan 27 '17 at 09:00
