
I'm trying to do few-shot summarization with GPT-Neo, using '###' as a custom end-of-sequence token.

So when I generate text, the call looks like this:

model.generate(inputs,
    max_new_tokens=80,
    eos_token_id=tokenizer.eos_token_id)
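
For context, my setup looks roughly like this (the model name and prompt are placeholders; the point is registering '###' as the EOS token so that tokenizer.eos_token_id resolves to it):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")

# Register '###' as the end-of-sequence token; if it wasn't already in
# the vocabulary this adds a new token, so the embeddings need resizing.
tokenizer.add_special_tokens({"eos_token": "###"})
model.resize_token_embeddings(len(tokenizer))

prompt = "...few-shot examples, each ending in ###...\nText: ...\nSummary:"
inputs = tokenizer(prompt, return_tensors="pt").input_ids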

The problem is that in some rare cases nothing gets generated at all, because '###' is somehow generated immediately after the prompt.

Is there a way to force the model to ignore the end-of-sequence token if it's the first token generated, so that it never returns an empty result?
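
For illustration, this is the kind of call I'm hoping for. I've seen that newer transformers releases expose a min_new_tokens generation argument, which I believe suppresses EOS until at least that many new tokens exist, but I'm not sure it's the intended fix:

# Assumption: min_new_tokens (in recent transformers versions) keeps
# eos_token_id from being chosen before this many new tokens are generated.
outputs = model.generate(
    inputs,
    max_new_tokens=80,
    min_new_tokens=1,  # disallow '###' as the very first generated token
    eos_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated part, dropping the prompt.
print(tokenizer.decode(outputs[0, inputs.shape[-1]:], skip_special_tokens=True))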

Andrew T.

0 Answers