
I have a BERT-based encoder model and I want to feed its last hidden state into a GPT2-based decoder model. There is no option in transformers.GPT2Config for using an encoder's last hidden layer as input to GPT2. How do I achieve this?

I want something like this:

from transformers import RobertaForMaskedLM, GPT2LMHeadModel

inputs = {"input_ids": input_ids,
          "token_type_ids": token_type_ids,
          "labels": labels,
          "attention_mask": attention_mask}

encoder           = RobertaForMaskedLM(config=encoder_config)
encoder_output    = encoder(**inputs, output_hidden_states=True)  # expose per-layer states
last_hidden_layer = encoder_output.hidden_states[-1]              # encoder's last layer

decoder           = GPT2LMHeadModel(config=decoder_config)
decoder_output    = decoder(**inputs, encoder_hidden_states=last_hidden_layer)  # the call I want

where last_hidden_layer is fed to the encoder-decoder (cross-) attention of each transformer block in GPT2.
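
If it helps, here is a minimal sketch of the wiring I have in mind. It assumes a transformers version whose GPT2Config accepts add_cross_attention (which should insert a cross-attention layer into every GPT2 block) and whose GPT2LMHeadModel.forward accepts encoder_hidden_states; the "gpt2" checkpoint name is just a placeholder, and I have not verified any of this against my installed version:

from transformers import GPT2Config, GPT2LMHeadModel

# Assumption: add_cross_attention=True gives each GPT2 block a
# cross-attention layer over external encoder states.
decoder_config = GPT2Config.from_pretrained("gpt2", add_cross_attention=True)
decoder        = GPT2LMHeadModel(config=decoder_config)

# Assumption: forward() accepts encoder_hidden_states, so every block
# cross-attends to the encoder's last hidden state.
decoder_output = decoder(
    input_ids=input_ids,
    attention_mask=attention_mask,
    labels=labels,
    encoder_hidden_states=last_hidden_layer,
    encoder_attention_mask=attention_mask,
)

Alternatively, transformers.EncoderDecoderModel is supposed to bolt an encoder and a decoder together and handle the cross-attention wiring itself, e.g. EncoderDecoderModel.from_encoder_decoder_pretrained("roberta-base", "gpt2") — though I have not checked whether GPT2 is a supported decoder in my version.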
