I'm trying to do few-shot summarization with GPT-Neo, using '###' as a custom end-of-sequence token (i.e. eos_token_id is set to the id that '###' maps to).
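For reference, the setup looks roughly like this (a sketch; "EleutherAI/gpt-neo-1.3B" is just a placeholder checkpoint, and I'm assuming '###' maps to a single id in the GPT-Neo vocabulary):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # placeholder checkpoint -- any GPT-Neo size behaves the same way here
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
    model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")

    # treat '###' as the end-of-sequence marker; this assumes '###'
    # encodes to a single id in the tokenizer's vocabulary
    tokenizer.eos_token = "###"

    prompt = "..."  # few-shot examples followed by the text to summarize
    inputs = tokenizer(prompt, return_tensors="pt").input_ids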
So when I generate the text, the call looks like this:

    output = model.generate(inputs,
                            max_new_tokens=80,
                            eos_token_id=tokenizer.eos_token_id)
The problem is that in some rare cases NOTHING gets generated at all, because '###' somehow gets generated as the very first token after the prompt.

Is there a way to force the model to ignore the end-of-sequence token IF it's the first token generated, so that the output is never empty?
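The closest thing I've found is the min_new_tokens generation argument (available in recent transformers releases, if I understand correctly), which suppresses the EOS token until at least that many new tokens have been produced. Something like:

    output = model.generate(inputs,
                            max_new_tokens=80,
                            min_new_tokens=1,  # disallow EOS as the first generated token
                            eos_token_id=tokenizer.eos_token_id)

    # decode only the new tokens (GPT-Neo returns prompt + continuation)
    summary = tokenizer.decode(output[0, inputs.shape[-1]:],
                               skip_special_tokens=True)

Is min_new_tokens=1 the right way to handle this, or is there a cleaner approach?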