I am using T5-Large by HuggingFace for inference. Given a premise and a hypothesis, I need to determine whether they are related or not. So, if I feed a string "mnli premise: This game will NOT open unless you agree to them sharing your information to advertisers. hypothesis: Personal data disclosure is discussed."
the model is supposed to return either entailment
, neutral
, or contradiction
.
Though I am able to determine the result, I am unable to determine the probability of the sequence generated. For instance, consider the model will generate entailment
for the example given above. I also want to know what is the probability of entailment
. So far, I have been using the following code,
from transformers import T5Tokenizer, T5ForConditionalGeneration
def is_entailment(premise, hypothesis):
entailment_premise = premise
entailment_hypothesis = hypothesis
token_output = tokenizer("mnli premise: " + entailment_premise + " hypothesis: " + entailment_hypothesis,
return_tensors="pt", return_length=True)
input_ids = token_output.input_ids
output = model.generate(input_ids, output_scores=True, return_dict_in_generate=True, max_new_tokens=15)
entailment_ids = output["sequences"]
entailment = tokenizer.decode(entailment_ids[0], skip_special_tokens=True)
return entailment
tokenizer = T5Tokenizer.from_pretrained('t5-small')
model = T5ForConditionalGeneration.from_pretrained('t5-small', return_dict=True)
premise = "This game will NOT open unless you agree to them sharing your information to advertisers."
hypothesis = "Personal data disclosure is discussed."
print(is_entailment(premise, hypothesis))
I have tried using the scores we get as output, but not sure how to calculate the probability from them. Same goes for the last hidden states that can be fetched as the output from the generate()
. I saw in another question on Stack Overflow that suggested using a softmax function on the last hidden states but I am unsure how to do it.
How can I calculate the probability of the sequence being generated? That is, if I get entailment
for a pair of hypothesis and premise, what would be the P(entailment)
?