I have trained the following LSTM Autoencoder:
Note: For training I set the timestep dimension of the input shape to None, so that I can train and test on data with different numbers of timesteps. I also use a Lambda layer instead of
repeat = keras.layers.RepeatVector(n=n_timesteps)
because RepeatVector needs a fixed n, which is not available when the number of timesteps is variable.
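import tensorflow as tf
from tensorflow import keras

def repeat(x_inp):
    # x: encoded vector (batch, units); inp: original input (batch, timesteps, features)
    x, inp = x_inp
    x = tf.expand_dims(x, 1)
    # Repeat the encoded vector once per timestep of the actual input
    x = tf.repeat(x, [tf.shape(inp)[1]], axis=1)
    return x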
n_timesteps = x_train.shape[1]
n_features = x_train.shape[2]
input_layer = keras.layers.Input(shape=(None, n_features))
lstm1 = keras.layers.LSTM(units=50, activation='tanh', name='LSTM_1', return_sequences=False)(input_layer)
dropout1 = keras.layers.Dropout(0.2)(lstm1)
repeat = keras.layers.Lambda(repeat)([dropout1, input_layer])
lstm2 = keras.layers.LSTM(units=50, activation='tanh', name='LSTM_2', return_sequences=True)(repeat)
dropout2 = keras.layers.Dropout(0.2)(lstm2)
out = keras.layers.TimeDistributed(keras.layers.Dense(units=n_features))(dropout2)
train_model = keras.Model(input_layer, outputs=out)
train_model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss="mse")
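A minimal sketch for checking that the Lambda-based helper copes with different sequence lengths, assuming TensorFlow 2.x in eager mode. The name repeat_to_length and the dummy tensors are illustrative only; the body mirrors the repeat function defined earlier.

```python
import tensorflow as tf

def repeat_to_length(x_inp):
    # Hypothetical copy of the repeat helper, renamed for this standalone check
    x, inp = x_inp
    x = tf.expand_dims(x, 1)                       # (batch, 1, units)
    x = tf.repeat(x, [tf.shape(inp)[1]], axis=1)   # (batch, timesteps, units)
    return x

encoded = tf.ones((4, 50))  # dummy encoder output: batch of 4, 50 units

shapes = []
for n_steps in (1, 7, 30):
    dummy_input = tf.zeros((4, n_steps, 2))  # dummy input with 2 features
    out = repeat_to_length([encoded, dummy_input])
    shapes.append(tuple(out.shape))

print(shapes)
```

The second axis of each result should follow the number of timesteps of the dummy input, which is the behaviour a fixed RepeatVector(n=n_timesteps) cannot provide.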
The model is stateless. I saved train_model and its weights, and then defined an otherwise identical model called predict_model with the following differences:
predict_batch_size = 1
input_layer = keras.layers.Input(batch_shape=(predict_batch_size, None, n_features), name='Encoder_Input')
After defining predict_model, I load the weights from train_model into it.
Now I want to perform real-time predictions on samples of shape (1, 1, 2), i.e. one sequence with a single timestep and two features.
For that I use
x_test_pred_single = predict_model.predict_on_batch(x_test_single)
The problem is that the results of this sample-by-sample prediction differ from those I get when predicting with batch_size > 1 and the same number of timesteps as used during training.
In the plot, the blue curve is the real data, the green and orange curves overlap each other, the red curve uses single-step prediction in a loop with the stateless model, and the purple curve shows the result of the method described in this post.
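The comparison above can be illustrated with a toy example. This is only a sketch under loud assumptions: it uses a scalar tanh RNN with made-up weights as a stand-in for an LSTM cell, not the autoencoder itself, and all names (rnn_step, run_sequence) are hypothetical. It shows one effect that may contribute to the mismatch: a stateless recurrent layer starts from a zero state on every call, so feeding one timestep per call is not equivalent to feeding the whole sequence.

```python
import math

def rnn_step(x, h, w_x=0.8, w_h=0.5):
    """One step of a toy scalar tanh RNN (stand-in for an LSTM cell)."""
    return math.tanh(w_x * x + w_h * h)

def run_sequence(xs, h0=0.0):
    """Process a sequence in one call, starting from hidden state h0."""
    h = h0
    outputs = []
    for x in xs:
        h = rnn_step(x, h)
        outputs.append(h)
    return outputs, h

sequence = [0.1, 0.4, -0.3, 0.9, 0.2]

# (a) Whole sequence in one call: the state flows across all timesteps.
full_outputs, _ = run_sequence(sequence)

# (b) One sample per call, stateless: the state is reset to zero each time.
stateless_outputs = [run_sequence([x])[0][0] for x in sequence]

# (c) One sample per call, carrying the state forward between calls.
carried_outputs = []
h = 0.0
for x in sequence:
    outs, h = run_sequence([x], h0=h)
    carried_outputs.append(outs[0])

print("stateless matches full sequence:", stateless_outputs == full_outputs)
print("carried state matches full sequence:", carried_outputs == full_outputs)
```

In this toy, only variant (c) reproduces the full-sequence outputs. In the autoencoder there is an additional difference that the toy does not capture: the encoder compresses the whole input window into a single vector, so a one-timestep input also gives it far less context than a full training-length window.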
Question: Does anyone know why the real-time, sample-by-sample predictions do not give me good results?