
In my use case I am using OpenAI models hosted on Azure. I am trying to generate a list of sentences or words with a specific length. Let's take this prompt as an example:

Give 10 Examples of pizza ingredients: 
1. tomatoes
2. mushrooms

The text-davinci-003 model completes the list as expected and stops, but the gpt-3.5-turbo model keeps generating tokens until the token limit is reached, even when I tell the model to stop once the task is done. Few-shot prompting doesn't seem to help here either.
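
For context, this is roughly how the two calls look on Azure with the pre-1.0 openai Python library; the endpoint, key, and deployment names are placeholders:

    import openai

    openai.api_type = "azure"
    openai.api_base = "https://<your-resource>.openai.azure.com/"  # placeholder
    openai.api_version = "2023-05-15"
    openai.api_key = "<your-key>"

    prompt = "Give 10 Examples of pizza ingredients:\n1. tomatoes\n2. mushrooms\n"

    # Completions API: text-davinci-003 finishes the list and stops by itself.
    completion = openai.Completion.create(
        engine="text-davinci-003",  # Azure deployment name (assumed)
        prompt=prompt,
        max_tokens=256,
    )

    # Chat API: gpt-3.5-turbo keeps generating until the limit is reached.
    chat = openai.ChatCompletion.create(
        engine="gpt-35-turbo",  # Azure deployment name (assumed)
        messages=[{"role": "user", "content": prompt}],
        max_tokens=256,
    )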

Hacky workarounds

  • Using a low value for max_tokens. But the value is hard to estimate because parts of the prompt change dynamically in the application, and the output still needs postprocessing to remove wasted tokens.

  • Putting a counter before the examples and using a specific number as a stop sequence (a sketch follows this list). With a plain counter like the one above, I need to ensure the stop sequence isn't generated accidentally before the list is done. With an unusual counter like "1~~", "2~~", ... there is a chance that the model malforms the stop sequence, so it still generates until the limit is reached.
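
A sketch combining both workarounds, assuming the model keeps numbering the items and using a placeholder deployment name:

    # Prime the list with a counter so the start of item 11 is predictable.
    prompt = (
        "Give 10 examples of pizza ingredients:\n"
        "1. tomatoes\n"
        "2. mushrooms\n"
        "3."
    )

    response = openai.ChatCompletion.create(
        engine="gpt-35-turbo",  # Azure deployment name (assumed)
        messages=[{"role": "user", "content": prompt}],
        max_tokens=150,  # loose ceiling from the first workaround
        stop=["11."],    # halt generation before an eleventh item appears
    )
    print(response["choices"][0]["message"]["content"])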

Is there a clean and easy way to make the model stop generating, like text-davinci-003 does?

1 Answer


I am running into the same issue. One hack is to append an end marker like --end-- to the expected output and add that as a stop sequence, since lowering the token limit is restrictive and not foolproof. I am still looking for the reason behind the difference in behaviour between the two models.
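
Something like this sketch, with a placeholder deployment name and one possible phrasing of the instruction:

    prompt = (
        "Give 10 examples of pizza ingredients. "
        "Write --end-- on its own line after the last item:\n"
        "1. tomatoes\n"
        "2. mushrooms"
    )

    response = openai.ChatCompletion.create(
        engine="gpt-35-turbo",  # Azure deployment name (assumed)
        messages=[{"role": "user", "content": prompt}],
        stop=["--end--"],  # the sentinel itself is never returned
    )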