
Right now I have:

from transformers import GPTNeoForCausalLM, GPT2Tokenizer

model = GPTNeoForCausalLM.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model.cuda()  # the inputs are moved to the GPU below, so the model must be too

input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
gen_tokens = model.generate(input_ids, do_sample=specifiedDoSample, output_scores=True,
                            temperature=specifiedTemperature, max_new_tokens=specifiedNumTokens,
                            repetition_penalty=specifiedRepetitionPenalty, top_p=specifiedTopP)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)

This will print the generated text. However, I also want it to list the top N tokens at each step along with their probabilities (N being a number I specify), similar to OpenAI's beta playground where you can select "Show probabilities: Full spectrum". For example, if the prompt is "You are now a", the next token should be shown as something like {"vampire": 51%, "corpse": 32%, ...}.

What is the easiest way to do this via Huggingface Transformers?

pete

2 Answers


You need to add output_scores=True, return_dict_in_generate=True to the call to the generate method. This gives you a scores entry for each generated token of the phrase, containing a tensor with the scores (apply a softmax to get the probabilities) of every vocabulary token, for each possible sequence in the beam search.

Look at generation_utils.py in the transformers source tree, starting at "def generate".
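
For illustration, here is a minimal sketch of that approach (not part of the original answer). The model name, prompt, and top_n below are placeholder values; the loop simply softmaxes each step's scores and prints the top-N tokens:

import torch
from transformers import GPTNeoForCausalLM, GPT2Tokenizer

model_name = "EleutherAI/gpt-neo-125M"  # assumption: any causal LM checkpoint works here
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPTNeoForCausalLM.from_pretrained(model_name)

prompt = "You are now a"
top_n = 5  # how many candidate tokens to show per step

input_ids = tokenizer(prompt, return_tensors="pt").input_ids
gen_out = model.generate(input_ids, do_sample=True, max_new_tokens=3,
                         output_scores=True, return_dict_in_generate=True)

# gen_out.scores is a tuple with one tensor per generated step,
# each of shape (batch_size, vocab_size); softmax turns the scores into probabilities.
for step, step_scores in enumerate(gen_out.scores):
    probs = torch.softmax(step_scores[0], dim=-1)
    top_probs, top_ids = probs.topk(top_n)
    top_tokens = {tokenizer.decode(tok_id): f"{p.item():.1%}"
                  for tok_id, p in zip(top_ids, top_probs)}
    print(f"step {step}: {top_tokens}")

With do_sample=True the scores are the logits after any sampling warpers (temperature, top-k, top-p) have been applied, so the softmax gives the distribution the model actually samples from.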

Raph
  • Thanks. Don't I also need to specify beam search or sampling and the number of runs to get, say, the top 50 next tokens? I am running into this issue: https://github.com/huggingface/transformers/issues/10012 . I can sort of use beam search to get the top choices, but the probabilities come out wrong. – pete Mar 09 '22 at 05:16
  • The beam sampling parameters have defaults in the model. You can add num_beams, num_beam_groups (not sure what this does), and num_return_sequences for the number of runs. There are lots of other parameters, for example no_repeat_ngram_size to stop the generator from running into a loop; it is recommended to read the docs. I'm currently also looking at per-token probabilities, and filed this bug report: https://github.com/huggingface/transformers/issues/16053 . – Raph Mar 10 '22 at 16:59
  • @pete, did you resolve this problem? I need the same thing: the probabilities for each token from generate(). – LearnToGrow May 09 '22 at 19:25
  • Hi @LearnToGrow, I have just posted an answer. – pete May 10 '22 at 00:26

A potential workaround is in the thread https://github.com/huggingface/transformers/issues/10012.

Use beam search as described in the thread, with n beams where n is the number of probabilities you want to display, but only look one token into the future. Then, following a comment by mshuffett:

I just moved this line below the return_dict_in_generate block.

next_token_scores = next_token_scores + beam_scores[:, None].expand_as(next_token_scores)

I tried it and it worked perfectly; the probabilities for the next single token were then displayed correctly.

Alternatively you can try the solutions described in https://github.com/huggingface/transformers/issues/16010. I haven't gotten around to it because it looks slightly more involved than the easy workaround.
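
For reference, here is a rough sketch of the call described above (the model name and n are illustrative, and the probabilities it reports are only meaningful once the source change from the linked issue is applied):

import torch
from transformers import GPTNeoForCausalLM, GPT2Tokenizer

model_name = "EleutherAI/gpt-neo-125M"  # assumption
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPTNeoForCausalLM.from_pretrained(model_name)

n = 10  # number of next-token candidates to display
input_ids = tokenizer("You are now a", return_tensors="pt").input_ids

beam_out = model.generate(input_ids,
                          num_beams=n, num_return_sequences=n,
                          max_new_tokens=1,  # only look one token into the future
                          output_scores=True, return_dict_in_generate=True)

# sequences_scores holds one log-score per returned beam; exp() turns it into a probability.
for seq, score in zip(beam_out.sequences, beam_out.sequences_scores):
    next_token = tokenizer.decode(seq[input_ids.shape[-1]:])
    print(f"{next_token!r}: {torch.exp(score).item():.1%}")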

pete
  • I am not sure what this code is doing. What I want are the scores corresponding to the tokens in sequences, so that applying softmax() and argmax() to the scores gives the same sequence indices returned by generate(). In other words, generate() should return the right scores. – LearnToGrow May 10 '22 at 01:20
  • I'm not sure what you mean, and I'm not familiar with any of this code. I solved the issue described in my original question: how to display the probabilities one token into the future. If that's not what you were expecting, your issue is probably different. – pete May 10 '22 at 01:36