Right now I have:
from transformers import GPTNeoForCausalLM, GPT2Tokenizer

model = GPTNeoForCausalLM.from_pretrained(model_name).cuda()  # the model must be on the same device as input_ids
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
gen_tokens = model.generate(input_ids, do_sample=specifiedDoSample, output_scores=True, temperature=specifiedTemperature, max_new_tokens=specifiedNumTokens, repetition_penalty=specifiedRepetitionPenalty, top_p=specifiedTopP)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)
This prints the generated text. However, I also want it to list the top N tokens at each generation step along with their probabilities (N being a number I specify), similar to OpenAI's beta playground, where you can select "Show probabilities: Full spectrum". For example, if the prompt is "You are now a", the output for the next token should look something like {"vampire": 51%, "corpse": 32% ... etc.}
What is the easiest way to do this with Hugging Face Transformers?
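
For reference, here is a minimal sketch of the kind of output I am after. It assumes that passing return_dict_in_generate=True together with output_scores=True makes generate() return an object whose scores attribute holds one (batch, vocab_size) tensor per generated token. The names top_n_from_logits and show_top_tokens are hypothetical helpers I made up for illustration, not part of the library. As far as I can tell, these scores are taken after temperature, top_p and repetition_penalty have been applied, so the percentages would describe the distribution actually sampled from rather than the raw model output.

```python
import math


def top_n_from_logits(logits, n):
    """Softmax one step's logits; return the n most likely (index, prob) pairs."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]  # exp(-inf) == 0.0 for filtered tokens
    total = sum(exps)
    best = sorted(range(len(exps)), key=exps.__getitem__, reverse=True)[:n]
    return [(i, exps[i] / total) for i in best]


def show_top_tokens(model_name, prompt, n=5, max_new_tokens=10):
    """Hypothetical driver: generate text and print the top-n tokens per step."""
    # Imported lazily so the helper above can be tried without these packages.
    from transformers import GPTNeoForCausalLM, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    model = GPTNeoForCausalLM.from_pretrained(model_name)
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(
        input_ids,
        do_sample=True,
        max_new_tokens=max_new_tokens,
        return_dict_in_generate=True,  # without this, output_scores is ignored
        output_scores=True,
    )
    # out.scores: one (batch, vocab_size) tensor per generated token.
    for step, step_scores in enumerate(out.scores):
        pairs = top_n_from_logits(step_scores[0].tolist(), n)
        print(step, {tokenizer.decode([i]): f"{p:.0%}" for i, p in pairs})
    print(tokenizer.batch_decode(out.sequences)[0])
```

Calling show_top_tokens(model_name, "You are now a", n=5) should then print one dictionary of token-to-percentage pairs per generated token, followed by the full text. The softmax is done in plain Python only to keep the helper easy to try on its own; torch.softmax followed by torch.topk on each score tensor would be the more usual route.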