
My question is: does this code make sense? And if it makes sense, what would be its purpose?

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

model = Sequential()
model.add(LSTM(18, return_sequences=True, batch_input_shape=(batch_size, look_back, dim_x), stateful=True))
model.add(Dropout(0.3))
model.add(LSTM(50, return_sequences=False, stateful=False))
model.add(Dropout(0.3))
model.add(Dense(1, activation='linear'))

Because if my first LSTM layer carries its state from one batch to the next, why shouldn't my second LSTM layer do the same?

I'm having a hard time understanding the LSTM mechanics in Keras, so I'm very thankful for any kind of help :)

And if you downvote this post, could you tell me why in the comments? Thanks.

D.Luipers

1 Answer


Your program is a regression problem: your model consists of 2 LSTM layers with 18 and 50 units respectively, followed by a Dense layer that outputs the regression value.

An LSTM requires a 3D input. Since the output of your first LSTM layer is the input for the second LSTM layer, the input of the second LSTM layer must also be 3D. So we set return_sequences=True in the first layer, because it then returns a 3D output that can be used as the input for the second LSTM.
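A minimal sketch of the shapes (the values batch_size=4, look_back=10, dim_x=3 are hypothetical and used purely for illustration):

from keras.models import Sequential
from keras.layers import LSTM

batch_size, look_back, dim_x = 4, 10, 3  # hypothetical values for illustration only

m = Sequential()
# return_sequences=True -> 3D output of shape (batch_size, look_back, 18)
m.add(LSTM(18, return_sequences=True, batch_input_shape=(batch_size, look_back, dim_x)))
# return_sequences=False -> 2D output of shape (batch_size, 50), ready for a Dense layer
m.add(LSTM(50, return_sequences=False))
m.summary()  # shows (4, 10, 18) for the first LSTM and (4, 50) for the second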

Your second LSTM does not return a sequence, because after the second LSTM you have a Dense layer, which does not need a 3D input.

[update]

In Keras, by default, LSTM states are reset after each batch of training data, so if you don't want the states to be reset after each batch you can set stateful=True. If the LSTM is made stateful, the final state of a batch will be used as the initial state for the next batch. You can later reset the states by calling reset_states().
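For example, a common training loop with stateful=True looks roughly like this (a sketch only, assuming hypothetical arrays x_train and y_train whose number of samples is a multiple of batch_size):

# Train one epoch at a time so states carry over within an epoch,
# then clear them manually between epochs.
for epoch in range(10):
    model.fit(x_train, y_train, batch_size=batch_size, epochs=1, shuffle=False)
    model.reset_states()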

kerastf
  • If I understand you right, you are explaining how return_sequences works, but I am more concerned with the stateful case – D.Luipers Oct 30 '18 at 09:52
  • In an LSTM, states are reset after each batch of training data, so if you want the states not to be reset after each batch you can set stateful=True. You can maintain states in both layers of your LSTM – kerastf Oct 30 '18 at 11:35
  • Yes. But can there be any purpose in having one layer with stateful=True and the next one with stateful=False? – D.Luipers Oct 30 '18 at 12:39
  • The code simply says that in the first layer the weights are not reset after each batch, so the final state of a batch will be used as the initial state for the next batch. In the second layer the states are reset after each batch. The purpose is a very generic question, and maybe they found it was the best fit during hyper-parameter tuning. – kerastf Oct 30 '18 at 13:20
  • As far as I know, only the states of the memory cells are reset, not the weights of the LSTM cell – D.Luipers Oct 30 '18 at 13:27
  • Yeah, you are correct, I meant to say states only but wrote it incorrectly – kerastf Oct 30 '18 at 14:02
  • Maybe you could update your answer so I can mark it as correct? – D.Luipers Oct 30 '18 at 14:58
  • @D.Luipers I have updated my answer to incorporate the stateful part – kerastf Oct 30 '18 at 15:16