I am looking to build a pipeline that applies the Hugging Face BART model step by step. Once the pipeline is built, I want to substitute the encoder attention heads with pre-trained / pre-defined encoder attention heads.
The pipeline I will be looking to implement is as follows:
- Tokenize input
- Run the tokenized input through the encoder with an adjusted attention layer
- Run the output through the decoder
- Convert the decoder output into a text summary
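For concreteness, here is what I believe the four steps look like with the stock model, before any attention substitution. The encoder is run separately via `model.get_encoder()` so each step is explicit, and `generate` is given the precomputed `encoder_outputs` so it does not re-run the encoder (my understanding is that `generate` skips the encoder pass when `encoder_outputs` is already in its keyword arguments):

```python
from transformers import AutoTokenizer, BartForConditionalGeneration

model_name = "facebook/bart-large-cnn"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

article = "Text to be summarised."

# 1. Tokenize input
inputs = tokenizer(article, padding=True, truncation=True, return_tensors="pt")

# 2. Run the tokenized input through the encoder
encoder_outputs = model.get_encoder()(
    input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
)

# 3. Run the output through the decoder; generate() reuses the
#    precomputed encoder_outputs instead of re-running the encoder
summary_ids = model.generate(
    inputs["input_ids"],
    encoder_outputs=encoder_outputs,
    attention_mask=inputs["attention_mask"],
)

# 4. Decode the generated ids into a text summary
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```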
At the moment my code looks like the following, with comments where I am stuck:
from transformers import AutoTokenizer, AutoModel, BartConfig
article = """Text to be summarised."""
model_name = "facebook/bart-large-cnn"
# Values of dictionaries are tensors
attention_heads = {
    "Cars": cars_encoder_attention_layer,
    "Countries": countries_encoder_attention_layer,
}
config = BartConfig.from_pretrained(model_name, output_hidden_states=True, output_attentions=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)
inputs = tokenizer(article, padding=True, truncation=True, return_tensors="pt")
model = AutoModel.from_pretrained(model_name, config=config)  # pass the config so the attention/hidden-state flags take effect
outputs = model(**inputs)
# Overwrite the encoder attentions with the desired attention heads
outputs.encoder_attentions = attention_heads["Cars"]
# Here I would take the overwritten encoder output and feed it into the decoder to generate the summary with the adjusted attention heads
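My understanding is that assigning to `outputs.encoder_attentions` only changes the returned record of attention weights, not the model's computation, so the substitution probably has to happen in the encoder's modules before the forward pass. Below is a runnable sketch of what I think that would look like. It assumes each entry in `attention_heads` is a list of state dicts, one per encoder layer, compatible with `BartAttention`; since `cars_encoder_attention_layer` is just a placeholder in my code above, the sketch reuses the model's own weights as a stand-in so it runs end to end:

```python
from transformers import AutoTokenizer, BartForConditionalGeneration

model_name = "facebook/bart-large-cnn"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

# Stand-in for the pre-trained "Cars" attention heads: one state dict per
# encoder layer, here copied from the model itself so the example executes.
attention_heads = {
    "Cars": [layer.self_attn.state_dict() for layer in model.model.encoder.layers],
}

# The encoder's self-attention modules live at
# model.model.encoder.layers[i].self_attn; loading weights into them changes
# the actual computation, unlike assigning to outputs.encoder_attentions.
for i, layer in enumerate(model.model.encoder.layers):
    layer.self_attn.load_state_dict(attention_heads["Cars"][i])

# Then run the usual pipeline: the adjusted encoder feeds the decoder inside
# generate(), and the output ids are decoded into the text summary.
article = "Text to be summarised."
inputs = tokenizer(article, padding=True, truncation=True, return_tensors="pt")
summary_ids = model.generate(
    inputs["input_ids"], attention_mask=inputs["attention_mask"]
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```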