
We've been experimenting with both text completion and chat completion to build an interactive AI.

What we've found is that with text completion the AI follows instructions much better, but once a number of messages accumulate in the prompt (e.g. around 8 back-and-forth exchanges of roughly 90 characters each), latency starts to climb. Token usage also grows (less important, but worth noting).

Has anyone been able to use text completion for long conversations, and if so, were you able to do it without taking a major latency hit?

Did you need an intermediate step to summarize the previous conversation rather than carry all the messages per request?
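For context, the summarization approach I'm asking about would look roughly like this: keep the last few turns verbatim and fold everything older into a single summary line before each request. This is only a sketch; `summarize()` is a hypothetical stub standing in for one extra completion call that asks the model to compress the old turns, and `MAX_RECENT` is an arbitrary cutoff.

```python
MAX_RECENT = 4  # keep this many latest messages verbatim (arbitrary choice)

def summarize(messages):
    # Placeholder: in practice this would be a separate completion
    # request asking the model to condense the older turns.
    return "Summary of %d earlier turns." % len(messages)

def build_prompt(history, user_input):
    # Split history into older turns (to be summarized) and recent
    # turns (kept word-for-word so the model has exact context).
    older, recent = history[:-MAX_RECENT], history[-MAX_RECENT:]
    parts = []
    if older:
        parts.append("Conversation so far: " + summarize(older))
    parts.extend(recent)
    parts.append("User: " + user_input)
    parts.append("AI:")
    return "\n".join(parts)

history = [
    "User: hi", "AI: hello",
    "User: how are you?", "AI: fine, thanks",
    "User: tell me a joke", "AI: why did the chicken cross the road?",
]
prompt = build_prompt(history, "tell me another one")
```

The prompt stays a bounded size no matter how long the conversation runs, which is why I'm wondering whether this is what people actually do, and how much the extra summarization request costs in latency itself.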
