
I have built a multi-step, multivariate LSTM model to predict the target variable 5 days into the future with a 5-day look-back. The model runs smoothly (even though it still needs improvement), but I cannot correctly invert the transformation applied once I get my predictions. I have seen on the web that there are many ways to pre-process and transform data. I decided to follow these steps:

  1. Data fetching and cleaning
import yfinance
df = yfinance.download(['^GSPC', '^GDAXI', 'CL=F', 'AAPL'], period='5y', interval='1d')['Adj Close']
df.dropna(axis=0, inplace=True)
df.describe()

[Data set table]

  2. Split the data set into train and test
size = int(len(df) * 0.80)
df_train = df.iloc[:size]
df_test = df.iloc[size:]
  3. Scale the train and test sets separately with MinMaxScaler()
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0,1))
df_train_sc = scaler.fit_transform(df_train)
df_test_sc = scaler.transform(df_test)
  4. Creation of 3D X and y time-series compatible with the LSTM model

I borrowed the following function from this article

import numpy as np

def create_X_Y(ts: np.array, lag=1, n_ahead=1, target_index=0) -> tuple:
    """
    A method to create X and Y matrix from a time series array for the training of 
    deep learning models 
    """
    # Extracting the number of features that are passed from the array 
    n_features = ts.shape[1]
    
    # Creating placeholder lists
    X, Y = [], []

    if len(ts) - lag <= 0:
        X.append(ts)
    else:
        for i in range(len(ts) - lag - n_ahead):
            Y.append(ts[(i + lag):(i + lag + n_ahead), target_index])
            X.append(ts[i:(i + lag)])

    X, Y = np.array(X), np.array(Y)

    # Reshaping the X array to an RNN input shape 
    X = np.reshape(X, (X.shape[0], lag, n_features))

    return X, Y

# In this example, let's assume that the first column (AAPL) is the target variable.

trainX, trainY = create_X_Y(df_train_sc, lag=5, n_ahead=5, target_index=0)
testX, testY = create_X_Y(df_test_sc, lag=5, n_ahead=5, target_index=0)
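
With lag=5, n_ahead=5 and four tickers, the arrays coming out of this step should have the following shapes (a quick sanity check):

# trainX / testX: (n_samples, 5, 4) -> 5 look-back days x 4 features
# trainY / testY: (n_samples, 5)    -> 5 future days of the target column
print(trainX.shape, trainY.shape)
print(testX.shape, testY.shape)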
  5. Model creation
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

def build_model(optimizer):
    grid_model = Sequential()
    grid_model.add(LSTM(64, activation='tanh', return_sequences=True, input_shape=(trainX.shape[1], trainX.shape[2])))
    grid_model.add(LSTM(64, activation='tanh', return_sequences=True))
    grid_model.add(LSTM(64, activation='tanh'))
    grid_model.add(Dropout(0.2))
    grid_model.add(Dense(trainY.shape[1]))
    grid_model.compile(loss='mse', optimizer=optimizer)
    return grid_model

from tensorflow.keras.wrappers.scikit_learn import KerasRegressor  # moved to the scikeras package in newer versions
from sklearn.model_selection import GridSearchCV

grid_model = KerasRegressor(build_fn=build_model, verbose=1, validation_data=(testX, testY))
parameters = {'batch_size' : [12,24],
              'epochs' : [8,30],
              'optimizer' : ['adam','Adadelta'] }
grid_search  = GridSearchCV(estimator = grid_model,
                            param_grid = parameters,
                            cv = 3)

grid_search = grid_search.fit(trainX,trainY)
grid_search.best_params_
my_model = grid_search.best_estimator_.model

  6. Get predictions
yhat = my_model.predict(testX)
  7. Invert transformation of predictions and actual values

Here my problems begin, because I am not sure which way to go. I have read many tutorials, but it seems that those authors prefer to apply MinMaxScaler() to the entire dataset before splitting it into train and test. I do not agree with this because, otherwise, the training data would be scaled with information we should not have access to (i.e. the test set). So I followed my own approach, but I am stuck here.

I found this possible solution on another post, but it's not working for me:

# invert scaling for forecast
pred_scaler = MinMaxScaler(feature_range=(0, 1)).fit(df_test.values[:,0].reshape(-1, 1))
inv_yhat = pred_scaler.inverse_transform(yhat)
# invert scaling for actual
inv_y = pred_scaler.inverse_transform(testY)

In fact, when I double-check the last values of the target in my original data set, they don't match the inverse-scaled version of testY.

Can someone please help me on this? Many thanks in advance for your support!

  • Why are you applying an inverse transform? Why not use a standard scaler for normalization before the train/test split and call it good? – Golden Lion Feb 11 '22 at 09:51
  • Because otherwise you'll scale the portion of the data set used for training with information that you are not supposed to have (i.e. the test set portion). – Luca Feb 15 '22 at 09:25

1 Answer


A few things are worth mentioning here. First, you cannot inverse-transform with a scaler that never saw the values you are feeding it. That is what happens when you use two different scalers: the network predicts values in the range of scaler 1 (fitted on the training data), and there is no guarantee that these fall within the range of scaler 2 (fitted on the test data). Second, the best practice is to fit your scaler on the training set and use that same scaler (transform only) on the test data as well; then you can inverse-transform your test results with it. Third, if the scaling goes off because the test set contains completely different values (this happens, for example, with live streaming data), it is up to you to decide how to deal with it; a min-max scaler will then produce values > 1.0.
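
As a minimal sketch for this setup (assuming the scaler fitted on df_train in step 3 is still in scope and the target is column 0, as in create_X_Y), you can undo the min-max mapping for just the target column using the parameters that scaler learned on the training data, instead of calling inverse_transform on the 4-column scaler:

# yhat and testY have shape (n_samples, 5) and hold only the scaled target
# column, so the 4-feature scaler's inverse_transform cannot be applied to
# them directly. MinMaxScaler maps x -> x * scale_[i] + min_[i] per column,
# so the inverse for the target column (index 0) is:
inv_yhat = (yhat - scaler.min_[0]) / scaler.scale_[0]
inv_y = (testY - scaler.min_[0]) / scaler.scale_[0]

Since testY was built from data scaled with that same train-fitted scaler, inv_y should now match the original AAPL values in df_test (up to floating-point error).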

  • I did exactly what you wrote: in fact, in step 3 I used one scaler (MinMaxScaler), fit_transform on the training set and only transform on the test set --> df_train_sc = scaler.fit_transform(df_train); df_test_sc = scaler.transform(df_test). The problem is that when I try to inverse transform the y_hat variable I get an error, because its shape does not match that of the original scaled df_test. Do you know how I can deal with this? – Luca Feb 15 '22 at 09:20