
I don't know whether this is a bug or an error on my side. I have also reported this issue here.

What I am trying to do is make my custom LSTM stateful. The code below runs fine without return_state=True; once I add it, it raises this error: The two structures don't have the same nested structure.

Here is a reproducible example:

from keras.layers import Lambda
import keras
import numpy as np
import tensorflow as tf
SEQUENCE_LEN = 45
LATENT_SIZE = 20
EMBED_SIZE = 50
VOCAB_SIZE = 100
BATCH_SIZE = 10
def rev_entropy(x):
    def row_entropy(row):
        # Shannon entropy of one row, computed from the counts of its unique values
        _, _, count = tf.unique_with_counts(row)
        count = tf.cast(count, tf.float32)
        prob = count / tf.reduce_sum(count)
        prob = tf.cast(prob, tf.float32)
        rev = -tf.reduce_sum(prob * tf.log(prob))
        return rev

    # row sums, used below (clipped and logged) as the maximum attainable entropy
    nw = tf.reduce_sum(x, axis=1)
    # entropy per row; NaNs are replaced with zeros
    rev = tf.map_fn(row_entropy, x)
    rev = tf.where(tf.is_nan(rev), tf.zeros_like(rev), rev)
    rev = tf.cast(rev, tf.float32)
    max_entropy = tf.log(tf.clip_by_value(nw, 2, LATENT_SIZE))
    # scale each row of x by its concentration factor
    concentration = (max_entropy / (1 + rev))
    new_x = x * (tf.reshape(concentration, [BATCH_SIZE, 1]))
    return new_x

inputs = keras.layers.Input(shape=(SEQUENCE_LEN,), name="input")

embedding = keras.layers.Embedding(output_dim=EMBED_SIZE, input_dim=VOCAB_SIZE, input_length=SEQUENCE_LEN, trainable=True)(inputs)
# encoder; adding return_state=True here is what triggers the error
encoded = keras.layers.Bidirectional(keras.layers.LSTM(LATENT_SIZE, return_state=True), merge_mode="sum", name="encoder_lstm")(embedding)

encoded = Lambda(rev_entropy)(encoded)
decoded = keras.layers.RepeatVector(SEQUENCE_LEN, name="repeater")(encoded)
# decoder mirrors the encoder, with the two directions summed
decoded = keras.layers.Bidirectional(keras.layers.LSTM(EMBED_SIZE, return_sequences=True, return_state=True), merge_mode="sum", name="decoder_lstm")(decoded)
autoencoder = keras.models.Model(inputs, decoded)
autoencoder.compile(optimizer="sgd", loss='mse')
autoencoder.summary()

x = np.random.randint(0, 90, size=(10, 45))
print(x.shape)

y = np.random.normal(size=(10, 45, 50))
print(y.shape)
history = autoencoder.fit(x, y, epochs=1)

Update 1

After applying the idea from the comment, tf.map_fn(row_entropy, encoded, dtype=tf.float32), I received a new error:

ValueError: Layer repeater expects 1 inputs, but it received 5 input tensors. Input received: [<tf.Tensor 'encoder_lstm/add_16:0' shape=(?, 20) dtype=float32>, <tf.Tensor 'encoder_lstm/while/Exit_3:0' shape=(?, 20) dtype=float32>, <tf.Tensor 'encoder_lstm/while/Exit_4:0' shape=(?, 20) dtype=float32>, <tf.Tensor 'encoder_lstm/while_1/Exit_3:0' shape=(?, 20) dtype=float32>, <tf.Tensor 'encoder_lstm/while_1/Exit_4:0' shape=(?, 20) dtype=float32>]

Note that this error is raised even without the Lambda layer, so something else seems to be wrong. If I inspect encoded, it is a list of length 5, whereas I expected a single tensor of shape (batch_size, latent_size)!
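From what I can tell, a Bidirectional LSTM with return_state=True returns the merged output followed by the hidden and cell states of the forward and backward layers, so the result would have to be unpacked before being fed to the next layer. A minimal sketch of what I mean, continuing from the code above (the state variable names are just illustrative):

encoder = keras.layers.Bidirectional(
    keras.layers.LSTM(LATENT_SIZE, return_state=True),
    merge_mode="sum", name="encoder_lstm")
# with return_state=True the call yields 5 tensors:
# merged output, forward h, forward c, backward h, backward c
encoded, fwd_h, fwd_c, bwd_h, bwd_c = encoder(embedding)
# only the merged output of shape (batch_size, LATENT_SIZE) is passed on
encoded = Lambda(rev_entropy)(encoded)
decoded = keras.layers.RepeatVector(SEQUENCE_LEN, name="repeater")(encoded)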

Everything is fine without adding return_state=True. Any help is appreciated!

  • Try changing `rev = tf.map_fn(row_entropy, x)` to `rev = tf.map_fn(row_entropy, encoded, dtype=tf.float32)`. – giser_yugang Jun 09 '19 at 04:33
  • @giser_yugang Thank you so much again for helping out :). I changed it, but I received an error with exactly this reproducible code: ValueError: Cannot reshape a tensor with 100 elements to shape [10,1] (10 elements) for 'lambda_1/Reshape' (op: 'Reshape') with input shapes: [5,20], [2] and with input tensors computed as partial shapes: input[1] = [10,1]. – sariii Jun 09 '19 at 05:01
  • And the error happens on this line: new_x = x * (tf.reshape(concentration, [BATCH_SIZE, 1])) – sariii Jun 09 '19 at 05:06
  • The shape of `concentration` is `(5,20)`. But you want to reshape it to `(BATCH_SIZE,1)=(10,1)`. You need to change it according to your need. – giser_yugang Jun 09 '19 at 05:07
  • @giser_yugang True, I had to change that one (a sketch pulling these changes together follows this thread). I have one more question, which is not about code; I hope you can help me with it too. If I have a custom layer (rev_entropy here, which does not have any parameters and just transforms the matrix), do I need to apply return_state=True? What about if my custom layer does have parameters? – sariii Jun 09 '19 at 05:17
  • Please add your answer so I can accept it – sariii Jun 09 '19 at 05:18
  • I think in both cases I need to have return_state=True, otherwise it does not make sense. Please correct me if I'm wrong. Can you also give a hint about how I can have that in the custom layer? – sariii Jun 09 '19 at 05:41
  • I don't quite understand your question. What is the relationship between `return_state=True` and a custom layer? Generally speaking, a `seq2seq` task often uses the last hidden state. But you can also use all the hidden states if you want to. It doesn't matter that you post it as an answer yourself. – giser_yugang Jun 09 '19 at 05:53
  • @giser_yugang So from what I learned, if I want to keep the state across all batches, the LSTM has to be stateful. I thought that if we design a custom layer, and that custom layer definitely outputs something, we might need to do something extra (which, based on your explanations, I guess the answer is no). https://www.quora.com/What-is-the-difference-between-stateful-and-stateless-learning-in-LSTM – sariii Jun 09 '19 at 16:10
  • Thank you so much for putting in the time, I really appreciate your help :) – sariii Jun 09 '19 at 16:12
  • Also, can I ask you to please have a look at this question as well: https://stackoverflow.com/questions/56433993/getting-error-while-adding-embedding-layer-to-lstm-autoencoder I asked it a week ago; it seems the preparation of my data has a problem, as I get the error during training rather than while building the model, and I could not figure it out this week. Sorry for asking; I feel stuck on that problem and have literally no clue, as I have tried whatever I found on the net. – sariii Jun 09 '19 at 16:24
  • Actually, this suggestion did not help. I could not test it last night; now that I have checked it, when we add return_state=True, the encoded layer (the first one) becomes a list of length 5. I have updated the question: even without the Lambda layer, it raises a new error. – sariii Jun 09 '19 at 17:52
  • This is not an error. You need to refer to [Understand the Difference Between Return Sequences and Return States for LSTMs in Keras](https://machinelearningmastery.com/return-sequences-and-return-states-for-lstms-in-keras/). – giser_yugang Jun 10 '19 at 01:52
  • @giser_yugang I truly appreciate your time. The thing is that I am aware of the concept; I read that link, I can say, three times before I started, but when it comes to putting the concept into code, the issues arise. What I'm saying is that I know exactly what return sequences and return states are. I feel it is okay to have both of them in the code, and I reviewed some code doing the same thing, but errors arise when I inject those concepts into my implementation. – sariii Jun 10 '19 at 14:10
  • I am open to any suggestion; even if you don't give a full answer but just suggest a book or a link, I would greatly appreciate it. I understand the question should not come across as if I have not studied this at all. – sariii Jun 10 '19 at 14:17
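Pulling the suggestions from this thread together, a rough sketch of an adjusted rev_entropy might look like the following. It assumes the encoder output has already been unpacked into a single (batch_size, LATENT_SIZE) tensor as sketched in the question, passes dtype=tf.float32 to tf.map_fn as suggested above, and avoids hard-coding BATCH_SIZE in the reshape; it is only an illustration of the changes being discussed, not a verified fix:

def rev_entropy(x):
    def row_entropy(row):
        # Shannon entropy of one row, from the counts of its unique values
        _, _, count = tf.unique_with_counts(row)
        count = tf.cast(count, tf.float32)
        prob = count / tf.reduce_sum(count)
        return -tf.reduce_sum(prob * tf.log(prob))

    nw = tf.reduce_sum(x, axis=1)
    # dtype=tf.float32 tells map_fn the output structure, as suggested in the comments
    rev = tf.map_fn(row_entropy, x, dtype=tf.float32)
    rev = tf.where(tf.is_nan(rev), tf.zeros_like(rev), rev)
    max_entropy = tf.log(tf.clip_by_value(nw, 2, LATENT_SIZE))
    concentration = max_entropy / (1 + rev)
    # reshape to (batch_size, 1) without hard-coding the batch size
    return x * tf.reshape(concentration, [-1, 1])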

0 Answers