
I'm new to the OpenAI API. I work with GPT-3.5-Turbo, using this code:

import openai  # pre-1.0 openai package; openai.api_key is set elsewhere

messages = [
    {"role": "system", "content": "You're a helpful assistant"}
]

while True:
    content = input("User: ")
    if content == 'end':
        save_log(messages)  # save_log is my own helper, defined elsewhere
        break
    messages.append({"role": "user", "content": content})

    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-16k",
        messages=messages
    )

    chat_response = completion.choices[0].message.content
    print(f'ChatGPT: {chat_response}')
    messages.append({"role": "assistant", "content": chat_response})

Result:

User: who was the first person on the moon?
GPT: The first person to step foot on the moon was Neil Armstrong, an American astronaut, on July 20, 1969, as part of NASA's Apollo 11 mission.
User: how tall is he?
GPT: Neil Armstrong was approximately 5 feet 11 inches (180 cm) tall.

But it requires tons of tokens. And I've heard that GPT-4 differs from GPT-3 in that it can remember previous messages on its own. Is that correct?

But if I remove the line that appends the latest message to the messages list and send only the current message:

completion = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": content}]
)

then it can't remember anything:

User: who was the first person on the moon?
GPT: The first person on the moon was Neil Armstrong on July 20, 1969.
User: how tall is he?
GPT: Without specific context or information about who "he" refers to, I'm unable to provide an accurate answer.

So I'm wondering: is there any workflow difference between GPT-3.5-Turbo and GPT-4?

toyota Supra
Realchini

1 Answer


Yes, it is always necessary to re-send all context. GPT, regardless of whether it's 3.5 or 4, is still not able to store information between calls via the API. There is a thread about that on the OpenAI community forum.
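Since every call is stateless, the usual way to cut token usage is to re-send only part of the history instead of all of it. Here is a rough sketch of that idea (trim_history and MAX_TURNS are my own hypothetical names, not part of the OpenAI SDK), keeping the system prompt plus the last few turns:

MAX_TURNS = 10  # how many user/assistant exchanges to keep; tune to your budget

def trim_history(messages, max_turns=MAX_TURNS):
    """Keep the system message plus the last max_turns exchanges."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_turns * 2:]  # 2 messages per exchange

completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-16k",
    messages=trim_history(messages)
)

This caps the cost of each request, at the price of the model forgetting anything older than the window.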

My theory is that OpenAI is waiting for the enterprise ChatGPT (which is almost done) before releasing a model that we can train to memorize information; otherwise this would create millions of independent AIs overnight.

There are libraries like LangChain that simulate this storage and work with files, but in practice they also need to send context with every request. The difference is that you can work with more tokens, because before calling OpenAI it first looks up only the necessary context.

(Note: if you choose to work with LangChain, keep in mind that it is quite expensive in terms of CPU and is not always as effective.)
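To make the retrieval idea concrete, here is a deliberately simplified sketch. pick_relevant is a hypothetical helper, and the keyword-overlap scoring is only a stand-in for the embedding search a real LangChain setup would use:

def pick_relevant(history, query, k=4):
    """Return the k stored messages sharing the most words with the query."""
    q_words = set(query.lower().split())
    return sorted(
        history,
        key=lambda m: len(q_words & set(m["content"].lower().split())),
        reverse=True,
    )[:k]

# Re-send only the most relevant earlier messages, not the whole history.
past = [m for m in messages if m["role"] != "system"]
completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-16k",
    messages=[{"role": "system", "content": "You're a helpful assistant"}]
             + pick_relevant(past, content)
             + [{"role": "user", "content": content}]
)

Only the few most relevant stored messages get re-sent, which is how these libraries fit long conversations into a limited context window.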

Similar Threads: 1, 2, 3

Mithsew