How do I quantify the attention between input and output sentences in a sequence-to-sequence language-modelling scenario (e.g., translation or summarization)?
For instance, consider these input and output sentences, where the document is the input and the abstract is the output of a sequence-to-sequence task.
# INPUT SENTENCES
document = [
    "This paper covers various aspects of learning.",
    "We will dive deep into algorithms.",
    "It's crucial to understand the basics.",
    "Modern techniques are also covered.",
]
# OUTPUT SENTENCES
abstract = [
    "The paper discusses machine learning.",
    "We focus on deep learning techniques.",
    "Results indicate superior performance.",
]
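Since the model attends between tokens rather than sentences, one prerequisite for a sentence-level heatmap is to record which token indices belong to which sentence once the sentences are joined into a single sequence. A minimal sketch of that bookkeeping, using plain whitespace splitting as a stand-in tokenizer (in a real run you would tokenize each sentence with the model's tokenizer, e.g. with `add_special_tokens=False`, and apply the same span logic):

```python
# Record (start, end) token spans per sentence after joining them into one
# sequence, so token-level attention can later be pooled per sentence.

document = [
    "This paper covers various aspects of learning.",
    "We will dive deep into algorithms.",
    "It's crucial to understand the basics.",
    "Modern techniques are also covered.",
]

def sentence_token_spans(sentences, tokenize=str.split):
    """Return [(start, end), ...] token index ranges, one per sentence."""
    spans, offset = [], 0
    for sent in sentences:
        n = len(tokenize(sent))            # token count for this sentence
        spans.append((offset, offset + n))
        offset += n
    return spans

spans = sentence_token_spans(document)
print(spans)  # [(0, 7), (7, 13), (13, 19), (19, 24)]
```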
How do I generate a heatmap of the attention between each document sentence and each abstract sentence? In other words, since cross-attention runs from the decoder to the encoder, how much attention does each abstract sentence pay to each document sentence? In this case, the heatmap should be a 4x3 matrix (4 document sentences by 3 abstract sentences).
Any ideas on how this can be done?
I'm using a FLAN-T5-LARGE model, and I load the model and tokenizer as usual. Here is an example of my attempt to quantify the attention between two sentences:
encoder_input_ids = tokenizer(
    "The study focuses on machine learning.",
    return_tensors="pt",
    add_special_tokens=True,
).input_ids
decoder_input_ids = tokenizer(
    text_target="This paper presents a novel approach in this domain.",
    return_tensors="pt",
    add_special_tokens=True,
).input_ids
# Forward pass to get attention weights
with torch.no_grad():
    outputs = model(
        input_ids=encoder_input_ids,
        decoder_input_ids=decoder_input_ids,
        output_attentions=True,  # required, otherwise cross_attentions is None
    )
# Cross-attention weights of the last decoder block
last_block = outputs.cross_attentions[-1]
# last_block.shape == torch.Size([1, 16, 12, 9]), where
# 1 = batch_size, 16 = num_attention_heads,
# 12 = decoder (target) sequence length, 9 = encoder (source) sequence length
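One way to reduce this tensor without losing the token-to-token structure is to average over the heads dimension only, which leaves a decoder-tokens-by-encoder-tokens matrix. A sketch using a normalized random tensor standing in for `outputs.cross_attentions[-1]` (the shapes match the example above):

```python
import torch

# Stand-in for outputs.cross_attentions[-1]: [batch, heads, dec_len, enc_len]
last_block = torch.rand(1, 16, 12, 9)
last_block = last_block / last_block.sum(dim=-1, keepdim=True)  # rows sum to 1

# Average over heads only (dim=1), keeping the token-to-token structure.
token_attn = last_block.mean(dim=1).squeeze(0)  # -> [12, 9]
print(token_attn.shape)  # torch.Size([12, 9])

# Each decoder-token row is still a distribution over encoder tokens.
assert torch.allclose(token_attn.sum(dim=-1), torch.ones(12))
```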
I thought I could just call last_block.mean() to obtain the average attention, but that collapses every dimension into a single scalar, so I don't think this is the right approach.
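For what it's worth, one common way to go from a token-level matrix to the sentence-level heatmap is: average over heads to get a [dec_len, enc_len] matrix, then pool each cell of the sentence grid from the corresponding token spans, summing over encoder tokens (total attention mass a document sentence receives) and averaging over decoder tokens. A sketch with synthetic shapes matching the example above; the sentence spans here are hypothetical placeholders, not real tokenizer output:

```python
import torch

def sentence_heatmap(token_attn, dec_spans, enc_spans):
    """Pool a [dec_len, enc_len] token-attention matrix into a
    [n_dec_sentences, n_enc_sentences] sentence-level matrix:
    sum over encoder tokens, mean over decoder tokens."""
    rows = []
    for ds, de in dec_spans:
        row = [token_attn[ds:de, es:ee].sum(dim=1).mean()
               for es, ee in enc_spans]
        rows.append(torch.stack(row))
    return torch.stack(rows)

# Synthetic stand-in: 12 decoder tokens (abstract), 9 encoder tokens (document)
token_attn = torch.rand(12, 9)
token_attn = token_attn / token_attn.sum(dim=-1, keepdim=True)

# Hypothetical sentence spans over those token sequences.
dec_spans = [(0, 4), (4, 8), (8, 12)]          # 3 abstract sentences
enc_spans = [(0, 2), (2, 4), (4, 7), (7, 9)]   # 4 document sentences

heat = sentence_heatmap(token_attn, dec_spans, enc_spans)
print(heat.shape)  # torch.Size([3, 4])
```

Transposing `heat` gives the 4x3 document-by-abstract orientation, which can then be rendered with e.g. `matplotlib.pyplot.imshow(heat.T)`. Because the pooling sums over all encoder tokens and averages a normalized distribution, each abstract-sentence row of `heat` still sums to 1.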