Context
I have read some blog posts about implementing stateful recurrent neural networks in Keras (for example here and here).
There are also several questions about stateful RNNs on Stack Overflow, of which this question comes closest to mine.
The linked tutorials use the fit() method instead of fit_generator() and manage the states manually by looping over the epochs themselves, calling fit() with epochs=1 and model.reset_states() in each iteration, as in this example taken from here:
from keras.models import Sequential
from keras.layers import Dense, LSTM

# fit an LSTM network to training data
def fit_lstm(train, batch_size, nb_epoch, neurons):
    # split into input columns and output column
    X, y = train[:, 0:-1], train[:, -1]
    # reshape to [samples, time steps, features]
    X = X.reshape(X.shape[0], 1, X.shape[1])
    model = Sequential()
    model.add(LSTM(neurons, batch_input_shape=(batch_size, X.shape[1], X.shape[2]), stateful=True))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    # train one epoch at a time so the states can be reset manually
    for i in range(nb_epoch):
        model.fit(X, y, epochs=1, batch_size=batch_size, verbose=0, shuffle=False)
        model.reset_states()
    return model
My question
I'd like to use fit_generator() instead of fit(), but also use stateful LSTM/GRU layers.
What I found missing in other Stack Overflow questions, like the one linked above, is:
- Can I proceed in the same way as with fit(), i.e. setting epochs=1, iterating over it x times, and calling model.reset_states() in each iteration like in the example? (A sketch of this option follows the list.)
- Or does fit_generator() already reset states only after finishing batch_size when stateful=True is used (which would be great)?
- Or does fit_generator() reset states after each single batch (which would be a problem)?
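For concreteness, here is a minimal sketch of the first option. Everything in it is hypothetical: batch_generator(), the synthetic data, and the layer sizes are my own assumptions, not taken from the tutorials above.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LSTM

batch_size = 32

def batch_generator(X, y, batch_size):
    # hypothetical generator: yields batches in their original order,
    # since shuffling would break the state carried between batches
    while True:
        for i in range(0, len(X) - batch_size + 1, batch_size):
            yield X[i:i + batch_size], y[i:i + batch_size]

# synthetic data: 320 samples, 10 time steps, 1 feature
X = np.random.rand(320, 10, 1)
y = np.random.rand(320, 1)

model = Sequential()
model.add(LSTM(4, batch_input_shape=(batch_size, 10, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')

# mirror the fit() pattern: one epoch per call, manual reset in between
steps = len(X) // batch_size
for epoch in range(25):
    model.fit_generator(batch_generator(X, y, batch_size),
                        steps_per_epoch=steps, epochs=1, verbose=0)
    model.reset_states()

Whether this manual loop is actually necessary, or whether fit_generator() handles the resets itself, is exactly what the three questions above ask.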
The latter question deals in particular with this statement from here:
> Stateless: In the stateless LSTM configuration, internal state is reset after each training batch or each batch when making predictions.
>
> Stateful: In the stateful LSTM configuration, internal state is only reset when the reset_states() function is called.
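As a side note, one way to check the second and third question empirically would be to read the layer's state after each batch with a callback. This is only a sketch, reusing the hypothetical model and batch_generator from the sketch above:

import numpy as np
from keras import backend as K
from keras.callbacks import LambdaCallback

# record the LSTM's hidden state h after every batch; if fit_generator()
# reset the state between batches, every recorded array would be all zeros
state_log = []
monitor = LambdaCallback(
    on_batch_end=lambda batch, logs: state_log.append(
        K.get_value(model.layers[0].states[0]).copy()))

model.fit_generator(batch_generator(X, y, batch_size),
                    steps_per_epoch=steps, epochs=1, verbose=0,
                    callbacks=[monitor])

print([float(np.abs(s).max()) for s in state_log])

Non-zero values after the first batch would indicate that the state is carried across batches rather than reset.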