
I have an encoder-decoder model whose structure is the same as the one in the tutorial at machinelearningmastery.com, with num_encoder_tokens = 1949, num_decoder_tokens = 1944, and latent_dim = 2048.

I would like to load the already-trained model, construct the encoder and decoder models from it, and try decoding some samples, but I get the error: `Graph disconnected: cannot obtain value for tensor Tensor("input_1_1:0", shape=(?, ?, 1949), dtype=float32) at layer "input_1". The following previous layers were accessed without issue: []`

Part of my code is the following:

from keras.models import Model
from keras.layers import Input, LSTM, Dense

# Encoder: keep only the final hidden and cell states
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = LSTM(latent_dim, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]

# Decoder: full output sequences, initialized with the encoder states
decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                     initial_state=encoder_states)
decoder_dense = Dense(num_decoder_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          batch_size=batch_size,
          epochs=epochs,
          validation_split=0.2)
model.save('modelname.h5')

# ...from here on, a different Python file for inference...

from keras.models import load_model, Model
from keras.layers import LSTM

encoder = LSTM(latent_dim, return_state=True)  # a fresh, untrained LSTM layer
model = load_model('modelname.h5')
encoder_model = Model(model.output, encoder(model.output)) # I get the error here

And what I would like to do here is:

encoder_inputs = Input(shape=(None, 1949))
encoder = LSTM(2048, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]
encoder_model = Model(encoder_inputs, encoder_states)
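and, for completeness, the matching decoder model along the lines of the same tutorial (a sketch; decoder_lstm and decoder_dense are meant to be the trained layers from above):

decoder_inputs = Input(shape=(None, 1944))
decoder_state_input_h = Input(shape=(2048,))
decoder_state_input_c = Input(shape=(2048,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)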

I would highly appreciate it if anyone could help me.

– namacha
  • Do you get a warning when saving your model (if I recall correctly, something about a non-hashable value)? – Eypros Mar 01 '19 at 13:38
  • Thanks for the comment. It was saved without a problem; I get the error when declaring `encoder_model` for inference (I added a comment in the code). – namacha Mar 01 '19 at 13:42
  • Have you also declared the encoder/decoder models as mentioned in the link? – Eypros Mar 01 '19 at 14:21
  • Yes, but I declare the decoder model after the encoder model, so I haven't reached the decoder model yet; I'm stuck with the error at the encoder declaration. – namacha Mar 01 '19 at 15:47
  • It seems that you are declaring the encoder/decoder models just for inference. You should declare them for training as well. Have you done that? – Eypros Mar 01 '19 at 17:41
  • I'm sorry for the lack of information, but yes, I declared both the encoder and decoder for training and inference. The structures of the vectors are the same. – namacha Mar 01 '19 at 18:36

1 Answer


The error happens because `Model(model.output, encoder(model.output))` tries to use an intermediate output tensor as a model input and feeds it through a brand-new, untrained LSTM. Keras cannot treat an intermediate tensor as a graph boundary, so it walks the graph all the way back to the original `input_1`, which is not among the inputs you supplied, hence "graph disconnected". Instead, rebuild the inference models from the loaded model's own inputs and layers.

Take a look at Robert Sim's answer to this Stack Overflow post: Restore keras seq2seq model, and at this pull request on GitHub: https://github.com/keras-team/keras/pull/9119.

He also provides an example at https://github.com/simra/keras/blob/simra/s2srestore/examples/lstm_seq2seq_restore.py, which shows how the model is loaded. The following code is adapted from that example.

from keras.models import load_model, Model
from keras.layers import Input

latent_dim = 2048  # must match the latent dimension used at training time

# Restore the model and construct the encoder and decoder.
model = load_model('s2s.h5')

encoder_inputs = model.input[0]   # input_1
encoder_outputs, state_h_enc, state_c_enc = model.layers[2].output   # lstm_1
encoder_states = [state_h_enc, state_c_enc]
encoder_model = Model(encoder_inputs, encoder_states)

decoder_inputs = model.input[1]   # input_2
decoder_state_input_h = Input(shape=(latent_dim,), name='input_3')
decoder_state_input_c = Input(shape=(latent_dim,), name='input_4')
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_lstm = model.layers[3]
decoder_outputs, state_h_dec, state_c_dec = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h_dec, state_c_dec]
decoder_dense = model.layers[4]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)
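
Once both inference models are reconstructed, decoding proceeds token by token: encode the source sequence once, then repeatedly feed the last sampled token and the carried-over states back into the decoder. Here is a minimal sketch of that loop (assuming one-hot inputs; start_idx, stop_idx, max_len, and num_decoder_tokens are placeholders you must supply, they are not part of the restored model):

import numpy as np

def decode_sequence(input_seq, start_idx, stop_idx, max_len):
    # Encode the input once to get the initial decoder states [h, c].
    states_value = encoder_model.predict(input_seq)

    # Seed the decoder with the one-hot start token.
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    target_seq[0, 0, start_idx] = 1.0

    decoded = []
    while True:
        output_tokens, h, c = decoder_model.predict(
            [target_seq] + states_value)
        sampled_idx = int(np.argmax(output_tokens[0, -1, :]))
        if sampled_idx == stop_idx or len(decoded) >= max_len:
            break
        decoded.append(sampled_idx)
        # Feed the sampled token back in and carry the states forward.
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_idx] = 1.0
        states_value = [h, c]
    return decoded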

– kevin