
In Keras the Bidirectional wrapper for RNNs also supports stateful=True. I don't really understand how this is supposed to work:

In a stateful unidirectional model, the state at the end of one batch is carried over as the initial state for the next batch. I assume it works the same way for the forward layer in the bidirectional model.

But where does the backward layer get its state from? If I understand everything correctly, it should technically receive its state from the "next" batch. But obviously the "next" batch has not been computed yet, so how does this work?
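
For concreteness, here is a minimal sketch of the unidirectional behaviour I mean (assuming the tf.keras / Keras 2 API; the layer sizes and toy data are arbitrary):

import numpy as np
from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model

# A single stateful LSTM; stateful layers need a fixed batch size.
inputs = Input(shape=(4, 2), batch_size=1)
outputs = LSTM(3, stateful=True)(inputs)
model = Model(inputs, outputs)

x = np.ones((1, 4, 2), dtype="float32")
y1 = model.predict(x)
y2 = model.predict(x)        # starts from the final state of the previous call
model.reset_states()
y3 = model.predict(x)        # starts from zero states again, so it matches y1
print(np.allclose(y1, y3))   # True: resetting restores the zero initial state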

birnbaum

1 Answer


One may think of a Bidirectional layer in the following manner:

forward = LSTM(...)(inputs)
backward = LSTM(..., go_backwards=True)(inputs)
output = concatenate([forward, backward])

So, as you can see, you lose the single temporal orientation: the input is analysed both from its beginning and from its end. In this setting, stateful=True simply makes each branch take its initial state from the state that same branch ended with on the previous batch (the forward layer from the forward layer, the backward layer from the backward layer).

This means your model loses the usual stateful interpretation, namely that corresponding samples in consecutive batches can be treated as one long sequence split into batches: the backward layer still carries its state forward in batch order, even though it reads each batch's input in reverse.
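
A minimal sketch of what this looks like in practice (assuming a tf.keras / Keras 2 version in which Bidirectional accepts a stateful inner layer; the sizes and data are illustrative): each directional layer keeps its own state between calls, and reset_states() clears both.

import numpy as np
from tensorflow.keras.layers import Bidirectional, Input, LSTM
from tensorflow.keras.models import Model

batch_size, timesteps, features = 2, 5, 3

# Stateful RNNs need a fixed batch size.
inputs = Input(shape=(timesteps, features), batch_size=batch_size)
outputs = Bidirectional(LSTM(4, stateful=True, return_sequences=True))(inputs)
model = Model(inputs, outputs)

x = np.random.rand(batch_size, timesteps, features).astype("float32")

y1 = model.predict(x)
# The forward layer now starts from the state it ended with after the first
# call; the backward layer does the same for its own (reversed-input) state.
y2 = model.predict(x)

model.reset_states()         # clears both the forward and the backward states
y3 = model.predict(x)        # starts from zero states again
print(np.allclose(y1, y3))   # True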

Marcin Możejko