Masking inputs in a bidirectional LSTM in Keras

Question

I am training an LSTM in Keras:

model2 = Sequential(name="LSTM-Model") # Model
model2.add(Input(shape=(X_train_tensor.shape[1], X_train_tensor.shape[2]), name='Input-Layer')) # Input Layer - need to specify the shape of inputs
model2.add(Masking(mask_value=-1000, input_shape=(timestep, 1)))
model2.add(Bidirectional(LSTM(units=64, activation='relu', recurrent_activation='sigmoid', stateful=False), name='Hidden-LSTM-Encoder-Layer')) # Encoder Layer

model2.add(RepeatVector(Y_train.shape[1], name='Repeat-Vector-Layer')) # Repeat Vector

model2.add(Bidirectional(LSTM(units=64, activation='relu', recurrent_activation='sigmoid', stateful=False, return_sequences=True), name='Hidden-LSTM-Decoder-Layer')) # Decoder Layer

model2.add(TimeDistributed(Dense(units=1, activation='linear'), name='Output-Layer')) # Output Layer, Linear(x) = x

However, the -1000 values in the sequence do not appear to be masked, and they inflate the metrics (loss, MSE). If I want to ignore either NaN or -1000 values in the input data, how can I mask them or otherwise handle them when they are passed in?

The data would look like this:

[1,2,3,-1000,5,6,-1000,8]

I am trying to get the model to skip the -1000s.
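To make the goal concrete, here is a minimal NumPy sketch (not Keras code) of the two things "skipping" could mean. It assumes -1000 is the sentinel for missing values, and the helper names `timestep_mask` and `masked_mse` are hypothetical. The first helper mirrors the rule the Keras `Masking` layer uses, where a timestep is masked only if all of its features equal `mask_value`. The second shows what an MSE that ignores sentinel or NaN targets would compute. The targets and predictions are made up for illustration.

```python
import numpy as np

MASK_VALUE = -1000.0  # sentinel from the question (assumed to mark missing values)

def timestep_mask(x, mask_value=MASK_VALUE):
    """Hypothetical helper: True where a timestep is kept.

    A timestep is kept if at least one feature differs from mask_value,
    which is the same rule the Keras Masking layer applies.
    """
    return np.any(x != mask_value, axis=-1)

def masked_mse(y_true, y_pred, mask_value=MASK_VALUE):
    """Hypothetical helper: MSE over targets that are neither sentinel nor NaN."""
    keep = (y_true != mask_value) & ~np.isnan(y_true)
    if not keep.any():
        return 0.0
    diff = y_true[keep] - y_pred[keep]
    return float(np.mean(diff ** 2))

# The example sequence, shaped (batch=1, timesteps=8, features=1).
x = np.array([1, 2, 3, -1000, 5, 6, -1000, 8], dtype=float).reshape(1, 8, 1)
mask = timestep_mask(x)  # False at the two -1000 positions

# Illustrative only: reuse the sequence as targets and invent predictions.
y_true = x[..., 0]
y_pred = np.array([[1, 2, 3, 4, 5, 6, 7, 8]], dtype=float)

plain = float(np.mean((y_true - y_pred) ** 2))  # dominated by the sentinels
masked = masked_mse(y_true, y_pred)             # sentinels excluded

print(mask)
print(plain, masked)
```

Note that a comparison against -1000 never matches NaN, so NaN entries would need to be replaced with the sentinel before a rule like `timestep_mask` could flag them. Also, a mask on the inputs does not by itself remove sentinel values from the targets, so if `Y_train` contains -1000 the loss would probably need something like `masked_mse` written as a custom loss.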

What are the values of `X_train_tensor.shape[1]`, `X_train_tensor.shape[2]`, and `timestep`? — thushv89, Aug 11 '22 at 04:19
@thushv89 `X_train_tensor.shape[1]=10` , `X_train_tensor.shape[2]=1` , `timestep=10`. The model works without a masking layer and without -1000 values within time series but when I add -1000 values and try masking it does not work accurately. Also the series I wrote above is just an example. You can assume the time series is fully filled with a sequence of 10. — Jay Upadhyay, Aug 11 '22 at 04:50
So `[1,2,3,-1000,5,6,-1000,8]` is a single sequence of data, or features of one time step? — thushv89, Aug 11 '22 at 05:58
@thushv89 I think both. The idea is we send this sequence into lstm and get a sequence back. The way I believe my model works is by sending the sequence `[1,2,3]` and trying to predict/verifying `[4,5,6]` as Y_train. Basically, the time series is the previous time as a feature. I'm new to machine learning so I hope that answers the question adequately. — Jay Upadhyay, Aug 11 '22 at 07:55
So a feature to the model at one time step is a single float value? — thushv89, Aug 11 '22 at 08:12
@thushv89 Well it is sequence to sequence. I think the timestep is a whole slice of values. In my case, it is 10 values inputted to predict another 10 values. — Jay Upadhyay, Aug 11 '22 at 08:54
I've provided a high level answer here, https://stackoverflow.com/questions/73322015/mask-layer-is-not-working-with-mlps-how-to-add-a-custom-layer-with-masking/73327137#73327137 which is related to your problem. — thushv89, Aug 11 '22 at 21:40
