
I would like to know the difference between these two models. The first one, judging by its model summary, has four layers, and you can directly define the number of units for the dimensionality reduction. The second model, however, has three layers, and you can't directly define the number of hidden units? Are both of them LSTM autoencoders for dimensionality reduction and regression analysis? Are there any good papers describing these two examples, which I found from Keras and here? I am not directly asking a coding question, so the variable values below are only placeholders. I hope this is also a good place for this topic.

1. Model:

import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, RepeatVector

samples = 1000
timesteps = 300
features = input_dim = 1
units = 100  # dimensionality of the encoded representation

data = np.random.random((samples, timesteps, input_dim))  # placeholder for the real data
data = np.reshape(data, (samples, timesteps, input_dim))

inputs = Input(shape=(timesteps, input_dim))
encoded = LSTM(units, return_sequences=False, name='encoder')(inputs)
decoded = RepeatVector(timesteps)(encoded)
decoded = LSTM(input_dim, return_sequences=True, name='decoder')(decoded)
autoencoder = Model(inputs, decoded)
encoder = Model(inputs, encoded)
print(autoencoder.summary())
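
For illustration (an addition beyond the original snippet): after compiling and fitting the autoencoder, the encoder sub-model performs the actual dimensionality reduction, mapping each length-300 sequence to a single 100-dimensional vector.

# compress each (timesteps, input_dim) sequence into a `units`-dimensional vector
compressed = encoder.predict(data)
print(compressed.shape)  # (1000, 100), i.e. (samples, units)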

2. Model:

from keras.models import Sequential

x = np.random.random((1000, 300, 1))

m = Sequential()
m.add(LSTM(100, input_shape=(300, 1)))
m.add(RepeatVector(300))
m.add(LSTM(100, return_sequences=True))
print(m.summary())
m.compile(loss='mse', optimizer='rmsprop', metrics=['mse', 'mape'])
history = m.fit(x, x, epochs=2000, batch_size=100)  # raises the shape error described below

When I try to feed both of them data with the shape e.g. (1000, 300, 1), the first one accepts it but the second does not; I get the error expected lstm_4 to have shape (None, 300, 100) but got array with shape (1000, 300, 1), with the chosen input_dim = 1 and units = 100. What am I doing wrong? This is what I want it to be:

LSTM(100, input_shape=(300, 1))

with units = 100. When I run the model, I get the following error: Error when checking target: expected lstm_2 to have shape (None, 300, 100) but got array with shape (1000, 300, 1)

Where is my mistake that makes the model reject my data shape and my unit size?
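
For reference, the mismatch can be made visible by comparing the model's output shape with the target's shape; a quick check using the random x above (an illustrative addition):

# the second model maps (samples, 300, 1) inputs to (samples, 300, 100) outputs,
# so fitting with x itself as the target fails Keras' shape check on the target
print(m.predict(x[:2]).shape)  # (2, 300, 100)
print(x.shape)                 # (1000, 300, 1)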

annstudent93

2 Answers


The number of units for the LSTM layers in the second model is the first argument to their initializers, i.e. 100 in your snippet. For example, if you let timestep = 10, input_dim = 2, and units = 2, then the two models are exactly equivalent.
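
A minimal sketch of that equivalence (using the example values timestep = 10, input_dim = 2, units = 2; both summaries then show identical layer shapes):

from keras.models import Model, Sequential
from keras.layers import Input, LSTM, RepeatVector

timestep, input_dim, units = 10, 2, 2

# functional-API version (first model)
inputs = Input(shape=(timestep, input_dim))
encoded = LSTM(units, return_sequences=False)(inputs)
decoded = RepeatVector(timestep)(encoded)
decoded = LSTM(input_dim, return_sequences=True)(decoded)
autoencoder = Model(inputs, decoded)

# sequential-API version (second model)
m = Sequential()
m.add(LSTM(units, input_shape=(timestep, input_dim)))
m.add(RepeatVector(timestep))
m.add(LSTM(input_dim, return_sequences=True))

autoencoder.summary()
m.summary()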

fuglede
  • When I try to feed both of them data with the shape e.g. (1000, 300, 1), the first one accepts it but the second does not; I get the error expected lstm_4 to have shape (None, 300, 100) but got array with shape (1000, 300, 1), with the chosen input_dim = 1 and units = 100. What am I doing wrong? – annstudent93 Apr 28 '18 at 20:11
  • Sounds like you're using `LSTM(1, input_shape=(300, 100))` instead of `LSTM(100, input_shape=(300, 1))`. – fuglede Apr 28 '18 at 20:20
  • I am correcting it: I used the shape you suggested, but still the same error. – annstudent93 Apr 28 '18 at 20:46
  • When I do LSTM(1, input_shape=(300, 1)), then it works, but that's not what I want; I have one feature and an input size of 300 that should be reduced to 100 units. – annstudent93 Apr 28 '18 at 21:00
  • [The first parameter to `LSTM` is the number of units](https://keras.io/layers/recurrent/#lstm); could you update your question with the error you got from `LSTM(100, input_shape=(300, 1))`? – fuglede Apr 29 '18 at 08:19
  • That error should never occur if you used `LSTM(100, input_shape=(300, 1))`; please provide your full model including the relevant data. If you use that layer and let e.g. `x = np.random.random((1000, 300, 1))`, then you'll find that e.g. `m.predict(x)` works as expected, `m.predict(x).shape` being `(1000, n, m)`, where `n` is the number of repetitions used in `RepeatVector`, and `m` is the number of units used in the second instance of `LSTM`. As long as your input shape is `(None, 300, 1)`, the model will compile regardless of the choice of units in the first `LSTM`. – fuglede Apr 29 '18 at 10:46
  • full model with random data as you suggested, but still not compiling – annstudent93 Apr 29 '18 at 11:36
  • The compilation runs fine here. The fitting, however, does not. You can't use your input data as the target data, i.e. the second parameter in `m.fit`, since the shape of the output of the model differs from the input (as evident from the final line in the "Output shape" column in your model summary). – fuglede Apr 29 '18 at 11:43
  • I checked the model summary, and you are right. What would you suggest doing then; what do I have to improve? – annstudent93 Apr 29 '18 at 11:48
  • If you want to achieve the same effect as in the first model, rather than `m.add(LSTM(100, return_sequences=True))`, use `m.add(LSTM(1, return_sequences=True))` (see the sketch after this thread). – fuglede Apr 29 '18 at 11:52
  • For the second `LSTM` layer, yes. – fuglede Apr 29 '18 at 12:05
  • thank you, I will try both out and share my solutions here later – annstudent93 Apr 29 '18 at 12:06
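
Putting the thread's fix together, a minimal sketch of the corrected second model (with placeholder random data; the decoder's width now matches the single input feature):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, RepeatVector

x = np.random.random((1000, 300, 1))

m = Sequential()
m.add(LSTM(100, input_shape=(300, 1)))  # encoder: compress each sequence to 100 units
m.add(RepeatVector(300))                # repeat the encoding for every timestep
m.add(LSTM(1, return_sequences=True))   # decoder: back to 1 feature per timestep
m.compile(loss='mse', optimizer='rmsprop')
m.fit(x, x, epochs=2, batch_size=100)   # target shape (1000, 300, 1) now matches the output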

The two models have no structural difference; they both consist of an encoder followed by a decoder implemented with LSTM layers. The difference is notational: the first model is defined using the functional API, with the input treated as a layer, whereas the second is defined using the sequential API. As for the encoder-decoder (otherwise known as seq2seq) architecture, it was originally proposed here, and it has since evolved considerably, with the most significant improvement being the attention mechanism.
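
To make the structural identity concrete (an illustrative sketch, not part of the original answer): the encoder sub-model that the first snippet builds explicitly with `encoder = Model(inputs, encoded)` can be recovered from the sequential version as well, since the underlying layers are the same.

from keras.models import Model, Sequential
from keras.layers import LSTM, RepeatVector

m = Sequential()
m.add(LSTM(100, input_shape=(300, 1)))
m.add(RepeatVector(300))
m.add(LSTM(1, return_sequences=True))

# counterpart of `encoder = Model(inputs, encoded)` from the functional version
encoder = Model(m.input, m.layers[0].output)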

KonstantinosKokos