I'm trying to use a GPT language model and get the probability it assigns to each word in the vocabulary at the last step of text generation. My model is a GPT2 from the transformers library. Below is how I call the pretrained model:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained(
    "HooshvareLab/gpt2-fa-poetry"
)
model = AutoModelForCausalLM.from_pretrained(
    "HooshvareLab/gpt2-fa-poetry"
)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
My goal is to take this output from the last layer of the model (a matrix with the length of the vocabulary, after the softmax activation) and use it in combination with another model.
I'm trying to do this in TensorFlow, but please share your comments if you think there are easier and more convenient ways of doing this in PyTorch.
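To illustrate what I mean, here is a rough PyTorch sketch of the kind of distribution I'm after (the prompt text and variable names are placeholders I made up, and I'm not sure this is the right or most efficient approach):

import torch
import torch.nn.functional as F

prompt = "some input text"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(device)

with torch.no_grad():
    outputs = model(**inputs)

# outputs.logits has shape (batch, sequence_length, vocab_size);
# the last position corresponds to the next-token prediction.
last_logits = outputs.logits[:, -1, :]
next_token_probs = F.softmax(last_logits, dim=-1)  # shape (batch, vocab_size)

Is something like this the way to go, or is there a cleaner way to expose this distribution so it can feed into another model?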