
I am trying to predict the stock price of IBM, but I am having trouble handling the date column when training a linear regression model. This is what my dataset looks like:

         Date      Open      High       Low     Close  Adj Close  Volume
0  1962-01-02  7.713333  7.713333  7.626667  7.626667   0.618153  387200
1  1962-01-03  7.626667  7.693333  7.626667  7.693333   0.623556  288000
2  1962-01-04  7.693333  7.693333  7.613333  7.616667   0.617343  256000
3  1962-01-05  7.606667  7.606667  7.453333  7.466667   0.605185  363200
4  1962-01-08  7.460000  7.460000  7.266667  7.326667   0.593837  544000

My code is:

from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import TimeSeriesSplit
from sklearn.linear_model import LogisticRegression
import pandas as pd
import numpy as np

df = pd.read_csv('IBM.csv')

df['Date'] = pd.to_datetime(df.Date)
df.set_index('Date', inplace=True)


X = df.drop('Adj Close', axis='columns')
Y = df['Adj Close']
scaler = MinMaxScaler()

X = pd.DataFrame(scaler.fit_transform(X), columns=X.columns)

timesplit= TimeSeriesSplit(n_splits=10)
for train_index, test_index in timesplit.split(X):
        X_train, X_test = X[train_index], X[test_index]
        y_train, y_test = Y[train_index], Y[test_index]

I got an error:

KeyError: "None of [Int64Index([   0,    1,    2,    3,    4,    5,    6,    7,    8,    9,\n            ...\n            1323, 1324, 1325, 1326, 1327, 1328, 1329, 1330, 1331, 1332],\n           dtype='int64', length=1333)] 
are in the [columns]"

Even when I manage to get past this error, I am unable to train my model.

desertnaut
geek

1 Answer

Indexing a DataFrame as X[train_index] selects columns by label, not rows by position. Pandas therefore looks for columns named 0, 1, 2, ..., finds none, and raises the KeyError.

See Pandas documentation on indexing and selecting data.
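To make the difference concrete, here is a minimal sketch on a toy frame. The values and the names rows, raised, and first_rows are made up for illustration; only the column names mirror the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; only the column names come from the question.
toy = pd.DataFrame({
    "Open": [7.71, 7.63, 7.69, 7.61],
    "Close": [7.63, 7.69, 7.62, 7.47],
})
rows = np.array([0, 1, 2])

# toy[rows] treats 0, 1, 2 as column labels, so it raises KeyError.
try:
    toy[rows]
    raised = False
except KeyError:
    raised = True

# .iloc selects rows by position, regardless of the index labels.
first_rows = toy.iloc[rows]
```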

Try replacing:

  • X_train, X_test = X[train_index], X[test_index]

With:

  • X_train, X_test = X.loc[train_index, :], X.loc[test_index, :]

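With that change the loop runs, because X was rebuilt from the scaler output and so has a default integer index that matches the positions returned by TimeSeriesSplit.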
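If you also want to train a model inside the loop, something along these lines might work. It is only a sketch under several assumptions: the data is a synthetic stand-in for IBM.csv, .iloc is used for both X and y so the selection is positional whatever the index is, LinearRegression replaces the LogisticRegression import because the target is continuous, and the scaler is fit per training fold rather than once on all rows.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import TimeSeriesSplit
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for IBM.csv (assumption: same columns, made-up values).
rng = np.random.default_rng(0)
n = 200
close = 7.5 + np.cumsum(rng.normal(0, 0.05, n))
df = pd.DataFrame({
    "Open": close + rng.normal(0, 0.02, n),
    "High": close + np.abs(rng.normal(0, 0.05, n)),
    "Low": close - np.abs(rng.normal(0, 0.05, n)),
    "Close": close,
    "Adj Close": close * 0.081 + rng.normal(0, 0.001, n),
    "Volume": rng.integers(200_000, 600_000, n),
}, index=pd.bdate_range("1962-01-02", periods=n, name="Date"))

# The dates stay in the index; they order the rows but are not a feature.
X = df.drop("Adj Close", axis="columns")
y = df["Adj Close"]

scores = []
for train_index, test_index in TimeSeriesSplit(n_splits=5).split(X):
    # Positional row selection for both the features and the target.
    X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y.iloc[train_index], y.iloc[test_index]

    # Fit the scaler on the training fold only, then reuse it on the test fold.
    scaler = MinMaxScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)

    model = LinearRegression().fit(X_train_scaled, y_train)
    scores.append(model.score(X_test_scaled, y_test))  # R^2 on the held-out fold
```

Fitting the scaler inside the loop keeps information from later dates out of the training data, which is the usual reason for using TimeSeriesSplit in the first place.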

Laurent