A Keras introduction to Seq2Seq models was published a few weeks ago; it can be found here. I do not really understand one part of this code:
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = Dense(num_decoder_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)
Here the decoder_lstm is defined. It is an LSTM layer with latent_dim units. We use the states of the encoder as initial_state for the decoder.
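For context, the encoder part of the same tutorial is built roughly like this (paraphrased from memory, so the placeholder sizes are mine, not from the tutorial):

from keras.layers import Input, LSTM

latent_dim = 256             # size of the hidden state
num_encoder_tokens = 70      # placeholder vocabulary size

encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = LSTM(latent_dim, return_state=True)
# return_state=True also returns the final hidden state and cell state
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]   # passed to the decoder as initial_state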
What I do not understand is why a Dense layer is then added after the LSTM layer, and why it works. Because of return_sequences=True the decoder returns the whole sequence, i.e. a 3D tensor, so how is it possible that adding a Dense layer on top of it works? I guess I am missing something here.
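To make the question concrete, here is a minimal standalone sketch (no encoder, and the dimensions are made up by me, they are not from the tutorial) that builds without complaint even though the LSTM output is three-dimensional:

from keras.models import Model
from keras.layers import Input, LSTM, Dense

latent_dim = 256
num_decoder_tokens = 90      # placeholder vocabulary size

decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs)   # no initial_state here, just checking shapes
decoder_outputs = Dense(num_decoder_tokens, activation='softmax')(decoder_outputs)

model = Model(decoder_inputs, decoder_outputs)
model.summary()
# the LSTM output has shape (batch, timesteps, latent_dim) and the Dense
# layer still accepts it, producing (batch, timesteps, num_decoder_tokens)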