
Context

I have read some blog posts about implementing stateful recurrent neural networks in Keras (for example here and here).
There are also several questions about stateful RNNs on Stack Overflow; this question comes closest to mine.

The linked tutorials use the fit() method instead of fit_generator() and handle the states manually: they iterate over the epochs themselves with epochs=1 in fit() and call model.reset_states() between epochs, as in this example taken from here:

from keras.models import Sequential
from keras.layers import Dense, LSTM

# fit an LSTM network to training data
def fit_lstm(train, batch_size, nb_epoch, neurons):
    X, y = train[:, 0:-1], train[:, -1]
    X = X.reshape(X.shape[0], 1, X.shape[1])  # reshape to (samples, timesteps, features)
    model = Sequential()
    model.add(LSTM(neurons, batch_input_shape=(batch_size, X.shape[1], X.shape[2]), stateful=True))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    for i in range(nb_epoch):
        # one pass over the data per iteration, then reset the internal states manually
        model.fit(X, y, epochs=1, batch_size=batch_size, verbose=0, shuffle=False)
        model.reset_states()
    return model


My question

I'd like to use fit_generator() instead of fit(), but also use stateful LSTM/GRU layers. What I could not find in other Stack Overflow questions, such as the one linked above, is:

  1. Can I proceed in the same way as with fit(), i.e. set epochs=1, iterate over it x times, and call model.reset_states() in each iteration as in the example (see the sketch after this list)?
  2. Or does fit_generator() already reset states only after finishing a full batch_size when stateful=True is used (which would be great)?
  3. Or does fit_generator() reset states after each single batch (which would be a problem)?
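
For concreteness, here is a minimal sketch of what option 1 would look like with fit_generator(). The generator batch_generator, the steps_per_epoch value, and the placeholder data are my own assumptions for illustration, not part of the linked tutorials:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LSTM

batch_size, timesteps, features = 1, 1, 1
steps_per_epoch = 100   # assumed number of batches per epoch
nb_epoch = 10           # assumed number of manual epochs

def batch_generator():
    # Keras generators must loop forever; batch order is fixed so that
    # sample i of one batch continues the sequence of sample i in the
    # previous batch (required for stateful training to make sense).
    while True:
        for _ in range(steps_per_epoch):
            X = np.random.rand(batch_size, timesteps, features)  # placeholder data
            y = np.random.rand(batch_size, 1)
            yield X, y

model = Sequential()
model.add(LSTM(4, batch_input_shape=(batch_size, timesteps, features), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')

for i in range(nb_epoch):
    # one epoch per call, then reset states manually, mirroring the fit() example
    model.fit_generator(batch_generator(), steps_per_epoch=steps_per_epoch,
                        epochs=1, verbose=0, shuffle=False)
    model.reset_states()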

The latter question relates in particular to this statement from here:

Stateless: In the stateless LSTM configuration, internal state is reset after each training batch or each batch when making predictions.
Stateful: In the stateful LSTM configuration, internal state is only reset when the reset_state() function is called.
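
The quoted behavior can be checked directly: with stateful=True, the internal state persists across predict() calls until reset_states() is cleared manually. A minimal sketch (layer size and input are arbitrary):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(4, batch_input_shape=(1, 1, 1), stateful=True))

x = np.ones((1, 1, 1))
a = model.predict(x)   # first call: starts from a zero state
b = model.predict(x)   # differs from a: the internal state persisted
model.reset_states()   # explicitly clear the internal state
c = model.predict(x)   # equals a again: the state was cleared
print(np.allclose(a, b), np.allclose(a, c))  # -> False True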

