
Hello, StackOverflow community!

I'm trying to create an inference model for a seq2seq (encoder-decoder) model with attention. Here is the definition of the inference model:

model = compile_model(tf.keras.models.load_model(constant.MODEL_PATH, compile=False))

encoder_input = model.input[0]
encoder_output, encoder_h, encoder_c = model.layers[1].output
encoder_state = [encoder_h, encoder_c]
encoder_model = tf.keras.Model(encoder_input, encoder_state)

decoder_input = model.input[1]
decoder = model.layers[3]
decoder_new_h = tf.keras.Input(shape=(n_units,), name='input_3')
decoder_new_c = tf.keras.Input(shape=(n_units,), name='input_4')
decoder_input_initial_state = [decoder_new_h, decoder_new_c]

decoder_output, decoder_h, decoder_c = decoder(decoder_input, initial_state=decoder_input_initial_state)
decoder_output_state = [decoder_h, decoder_c]

# These lines cause an error
context = model.layers[4]([encoder_output, decoder_output])
decoder_combined_context = model.layers[5]([context, decoder_output])
output = model.layers[6](decoder_combined_context)
output = model.layers[7](output)
# end

decoder_model = tf.keras.Model([decoder_input] + decoder_input_initial_state, [output] + decoder_output_state)
return encoder_model, decoder_model

When I run this code, I get the following error:

ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_5:0", shape=(None, None, 20), dtype=float32) at layer "lstm_4". The following previous layers were accessed without issue: ['lstm_5']

If I exclude the attention block, the model is built without any errors at all:

model = compile_model(tf.keras.models.load_model(constant.MODEL_PATH, compile=False))

encoder_input = model.input[0]
encoder_output, encoder_h, encoder_c = model.layers[1].output
encoder_state = [encoder_h, encoder_c]
encoder_model = tf.keras.Model(encoder_input, encoder_state)

decoder_input = model.input[1]
decoder = model.layers[3]
decoder_new_h = tf.keras.Input(shape=(n_units,), name='input_3')
decoder_new_c = tf.keras.Input(shape=(n_units,), name='input_4')
decoder_input_initial_state = [decoder_new_h, decoder_new_c]

decoder_output, decoder_h, decoder_c = decoder(decoder_input, initial_state=decoder_input_initial_state)
decoder_output_state = [decoder_h, decoder_c]

# The attention block (which causes the error) is excluded here
# context = model.layers[4]([encoder_output, decoder_output])
# decoder_combined_context = model.layers[5]([context, decoder_output])
# output = model.layers[6](decoder_combined_context)
# output = model.layers[7](output)
# end

decoder_model = tf.keras.Model([decoder_input] + decoder_input_initial_state, [decoder_output] + decoder_output_state)
return encoder_model, decoder_model

1 Answer


I think you also need to return the encoder output from the encoder model and then feed it to the decoder model as an extra input, since the attention part requires it. As written, the graph is disconnected because encoder_output depends on encoder_input, which is not an input of decoder_model. Maybe these changes could help:

model = compile_model(tf.keras.models.load_model(constant.MODEL_PATH, compile=False))
encoder_input = model.input[0]
encoder_output, encoder_h, encoder_c = model.layers[1].output
encoder_state = [encoder_h, encoder_c]
encoder_model = tf.keras.Model(inputs=[encoder_input], outputs=[encoder_output] + encoder_state)

decoder_input = model.input[1]
decoder_input2 = tf.keras.Input(shape=x)  # where x is the shape of the encoder output
decoder = model.layers[3]
decoder_new_h = tf.keras.Input(shape=(n_units,), name='input_3')
decoder_new_c = tf.keras.Input(shape=(n_units,), name='input_4')
decoder_input_initial_state = [decoder_new_h, decoder_new_c]

decoder_output, decoder_h, decoder_c = decoder(decoder_input, initial_state=decoder_input_initial_state)
decoder_output_state = [decoder_h, decoder_c]

context = model.layers[4]([decoder_input2, decoder_output])
decoder_combined_context = model.layers[5]([context, decoder_output])
output = model.layers[6](decoder_combined_context)
output = model.layers[7](output)

decoder_model = tf.keras.Model([decoder_input, decoder_input2] + decoder_input_initial_state, [output] + decoder_output_state)
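
For completeness, here is a minimal sketch of how the two models could then drive step-by-step greedy decoding. This is an assumption-laden illustration, not code from the post: decode_sequence, start_token, end_token and max_len are hypothetical names, and it assumes one-hot decoder inputs (the (None, None, 20) shape in the error message suggests a vocabulary of 20).

import numpy as np

# Hypothetical greedy-decoding loop; assumes encoder_model returns
# [encoder_output, h, c] and decoder_model takes
# [target_seq, encoder_output, h, c], as built above.
def decode_sequence(input_seq, encoder_model, decoder_model,
                    vocab_size, start_token, end_token, max_len=100):
    # Encode the source once; keep the full encoder output so the
    # attention layer can attend over it at every decoder step.
    encoder_output, h, c = encoder_model.predict(input_seq)

    # Single-step decoder input holding only the start token.
    target_seq = np.zeros((1, 1, vocab_size))
    target_seq[0, 0, start_token] = 1.0

    decoded = []
    for _ in range(max_len):
        output, h, c = decoder_model.predict(
            [target_seq, encoder_output, h, c])
        token = int(np.argmax(output[0, -1, :]))
        if token == end_token:
            break
        decoded.append(token)
        # Feed the prediction back as the next single-step input.
        target_seq = np.zeros((1, 1, vocab_size))
        target_seq[0, 0, token] = 1.0
    return decoded

The key point is that the decoder model is called once per output token, with the recurrent states threaded through the loop and the same encoder output passed in at every step.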

  • You shouldn't answer in comments; better edit your answer to add these details. – mustaccio Jul 18 '20 at 13:12
  • @ValayBundele The inference model is now built correctly. But now I can't pass the full attention tensor into the decoder model, since the inference process takes tokens from the input sequence one at a time. In other words, if I try to pass the target tensor sequence together with the attention tensor sequence into the decoder inference model, I get the following error message. – Nikita Tolstykh Jul 20 '20 at 01:28
  • **ValueError: Dimension 1 in both shapes must be equal, but are 142 and 1. Shapes are [?,142] and [?,1]. for '{{node functional_3/concatenate/concat}} = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32](functional_3/attention/MatMul_1, functional_3/lstm_1/PartitionedCall:1, functional_3/concatenate/concat/axis)' with input shapes: [?,142,128], [?,1,128], [] and with computed input tensors: input[2] = <2>.** – Nikita Tolstykh Jul 20 '20 at 01:31
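
A hedged guess at this remaining mismatch, judging by the shapes in the trace: [?,142,128] is the attention context (one vector per encoder step) and [?,1,128] is the single-step decoder output, so the Concatenate layer cannot join them along the time axis. If layer 4 is the built-in tf.keras.layers.Attention, its first input is the query and its output keeps the query's time dimension, so passing the decoder output first would give a context with time dimension 1 that concatenates cleanly:

context = model.layers[4]([decoder_output, decoder_input2])  # query first: context gets the decoder's time dimension
decoder_combined_context = model.layers[5]([context, decoder_output])

Note, though, that the training model calls the layer as [encoder_output, decoder_output]; if it was trained with that order, the training model may need the same correction. A custom attention layer may expect a different argument order entirely.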